Preface



The Asian Computing Science Conference (ASIAN) series was initiated in 1995 
to provide a forum for researchers in computer science in Asia to meet and 
to promote interaction with researchers from other regions. The previous five 
conferences were held, respectively, in Bangkok, Singapore, Kathmandu, Manila, 
and Phuket. The proceedings were published in the Lecture Notes in Computer 
Science Series of Springer-Verlag. 

This year’s conference (ASIAN 2000) attracted 61 submissions from which 18 
papers were selected through an electronic program committee (PC) meeting. 

The themes for this year’s conference are: 

— Logics in Computer Science 

— Data Mining 

— Networks and Performance 

The keynote speaker for ASIAN 2000 is Jean Vuillemin (ENS, France) and
the invited speakers are Ramamohanarao Kotagiri (U. Melbourne, Australia)
and Alain Jean-Marie (LIRMM, France). We thank them for accepting our
invitation.

This year’s conference is sponsored by the Asian Institute of Technology 
(Thailand), INRIA (France), the National University of Singapore (Singapore), 
and UNU/IIST (Macau SAR, China). We thank all these institutions for their 
continued support of the ASIAN series. 

This year’s conference will be held in Penang, Malaysia. We are much obliged 
to Universiti Sains Malaysia and Penang State Government for providing the 
conference venue and to Dr. Abdullah Zawawi Haji Talib for making the local 
arrangements. 

We also wish to thank the PC members and the large number of referees for
their substantial work in assessing the submitted papers.

Finally, it is a pleasure to acknowledge the friendly and efficient support 
provided by Alfred Hofmann and his team at Springer-Verlag in bringing out 
this volume. 
Finite Digital Synchronous Circuits
Are Characterized by 2-Algebraic Truth Tables



Jean Vuillemin

École Normale Supérieure, 45 rue d’Ulm, 75230 Paris cedex 05, France.

Jean.Vuillemin@ens.fr



Abstract. A digital function maps sequences of binary inputs into
sequences of binary outputs. It is causal when the output at cycle n is a
Boolean function of the input from cycles 0 through n.

A causal digital function $f$ is characterized by its truth table, an infinite
sequence of bits $(F_n)$ which gathers all outputs for all inputs. It is
identified with the power series $\sum F_n z^n$, with coefficients in the
two-element field $\mathbb{F}_2$.

Theorem 1. A digital function can be computed by a finite digital
synchronous circuit if and only if it is causal and its truth table is an
algebraic number over $\mathbb{F}_2(z)$, the field of polynomial fractions (mod 2).

A data structure, recursive sampling, is introduced to provide a canonical
representation for each finite causal function $f$. It can be mapped,
through finite algorithms, into a circuit SDD(f), an automaton SBA(f),
and a polynomial poly(f); each is characteristic of $f$. One can thus
automatically synthesize a canonical circuit, or software code, for computing
any finite causal function $f$ presented in some effective form. Through
recursive sampling, one can verify, in finite time, the validity of any
hardware circuit or software program for computing $f$.



1 Physical Deterministic Digital System 

Consider a discrete time digital system: at each integer cycle $n$¹, the system
receives input bits $x_n \in \mathbb{B} = \{0, 1\}$, and emits output bits $y_n \in \mathbb{B}$. The function
$f$ of this system is to map infinite sequences of input bits $x = (x_n)$ into
infinite sequences of output bits $y = f(x) = (y_n)$. Call digital such a function
$f \in \mathbb{D} \to \mathbb{D}$, where $\mathbb{D} = \mathbb{N} \to \mathbb{B}$ is the set of infinite binary sequences. Our
aim is to characterize which functions can be computed by deterministic digital
physical systems, such as electronic circuits, and which cannot.

To simplify, we exclude analog [1] and asynchronous systems. As long as the
function of such exotic systems remains deterministic and digital, an equivalent
system may be implemented through a digital synchronous electronic chip. The
concept of Digital Synchronous Circuit (DSC) provides a mathematical model
for the form and function of this class of physical systems.

¹ Throughout this text, the letter n ranges over the natural numbers: $n \in \mathbb{N}$.
Established techniques exist to map finite DSC descriptions into silicon
chips [2]. With reconfigurable systems [3], the process can be fully automated:
from a finite DSC representation for mathematically computing $f$, compile a
binary configuration, and download it into some programmable device, in order
to physically compute $f$. Without further argument, admit here that the class
of functions defined by finite DSC captures the proper mathematical concept
behind the motivating question. No regard is given to size limitations arising
from technology, economics, or other sources.

— Physical circuits are constrained by time causality: output $y(t)$ at time $t$ may
only depend upon inputs $x(t')$ from the past, $t' \le t$.

— From their physical nature, electronic circuits must be finite. 

Causality and finiteness are thus necessary conditions, for digital functions 
to be computable by deterministic physical devices. We show that they are suf- 
ficient, and characterize finite causal functions, in a constructive way. 

1.1 Infinite SDD Procedure 

A first answer to the motivating question is provided in [4], through an infinite 
construction, the Synchronous Decision Diagrams SDD. 

Theorem 2 (Vuillemin [4]). 

1. To any causal function $f$, one can associate a canonical circuit
$\mathrm{SDD}(f) \in \mathrm{DSC}$ for computing $f$.

2. Circuit $\mathrm{SDD}(f)$ is finite if and only if function $f$ is computable by some
finite system.

Yet the infinite SDD construction relies on the ability to test for equality $g = h$
between digital functions $g$, $h$. This operation is not computable in general, even
when $g$ and $h$ are both computable. Also, the definition of "finiteness" is not
made explicit in Theorem 2, and the input to the "procedure" is ill-specified.

Such limitations are partly removed by Berry [5] and Winkelman [6]: both
base implementations of the SDD procedure on representing digital functions
by Finite State Machines (FSM).

2 Binary Algebra 

Infinite binary sequences $\mathbb{D} = \mathbb{N} \to \mathbb{B}$ have a rich mathematical structure. A
digital sequence $a \in \mathbb{D}$ codes, in a unique way: the set $\{a\} = \{n : a_n = 1\}$ of
integers; the formal power series $a(z) = \sum a_n z^n$; the 2-adic "integer"
$a(2) = \sum a_n 2^n$. We identify all representations, and write (see [4])
$\mathbb{D} = \wp(\mathbb{N}) = \mathbb{F}_2(z) = \mathbb{Z}_2$; for example:

$(01) = \{1 + 2n\} = -2/3 = z/(1 + z^2)$.
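The three readings of the example $(01)$ can be checked numerically on a finite prefix. The sketch below is only an illustration under stated assumptions: sequences are truncated to their first `N` bits and packed into a Python integer (bit n holds $a_n$); the names `N`, `MOD`, `a`, `minus_two_thirds` and `product` are ours, not the paper's.

```python
# Truncate every sequence to its first N bits; bit n of the integer is a_n.
N = 16
MOD = 1 << N

# The periodic sequence (01): a_n = 1 exactly when n is odd.
a = sum(1 << n for n in range(1, N, 2))

# 2-adic reading: a(2) should equal -2/3 modulo 2^N.
# pow(3, -1, MOD) is the modular inverse of 3 (Python 3.8 or later).
minus_two_thirds = (-2 * pow(3, -1, MOD)) % MOD

# Power-series reading over F_2: (1 + z^2) * a(z) should equal z modulo z^N.
# Multiplying by (1 + z^2) without carries is a XOR with a copy shifted by 2.
product = (a ^ (a << 2)) % MOD

print(a == minus_two_thirds)   # 2-adic identity holds on the prefix
print(product == 0b10)         # 0b10 encodes the series z
```

Both comparisons hold for any prefix length, since each identity is exact modulo $2^N$ (respectively modulo $z^N$).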





Binary Algebra imports all underlying operations into a single structure:
$(\mathbb{D}, \neg, \cup, \cap, z, z^{-}, \oplus, \otimes, +, \ldots)$.

1. $(\mathbb{D}, \cup, \cap)$ is a Boolean Algebra, isomorphic to sets $\wp(\mathbb{N})$ of integers;

2. $(\mathbb{D}, \oplus, \otimes, \oslash)$ is a ring, isomorphic to the formal power series $\mathbb{F}_2(z)$;

3. $(\mathbb{D}, 0, 1, +, -, \times)$ is a ring, isomorphic to the 2-adic integers $\mathbb{Z}_2$.

The up-sampling operator is noted $\uparrow x = x(z^2) = x \otimes x$. The down-sampling
operator is noted $\downarrow x = \downarrow (x_n) = (x_{2n})$. See the related Noble identities, in the
appendix.

In addition to the axiomatic relations implied by each of the three structures 
in D, hybrid relations exist between the operators in Binary Algebra. Some 
are listed in the appendix. There are more: indeed, each arithmetical circuit 
implements some hybrid relation [4]. For example, base -2 coding, is defined by 

$$\sum_{k<n} x_k 2^k = \sum_{k<n} y_k (-2)^k \pmod{2^n}. \qquad (1)$$

It is also known as Booth coding $y = \mathrm{booth}(x)$, Polish code (in [7]), and may be
computed by the hybrid formula:

$\mathrm{booth}(x) = (01) \oplus (x + (01))$.
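As an illustration, the hybrid formula can be tested against the defining relation (1) on n-bit prefixes. This is a minimal sketch, assuming that sequences are packed into Python integers with bit k holding the k-th term; the helper names `booth` and `value_base_minus_two` are ours.

```python
def booth(x, n):
    """Low n bits of (01) XOR (x + (01)), packed into an int."""
    mask01 = sum(1 << k for k in range(1, n, 2))   # (01): ones at odd positions
    return (mask01 ^ (x + mask01)) & ((1 << n) - 1)

def value_base_minus_two(y, n):
    """Evaluate the sum of y_k * (-2)^k over the low n bits of y."""
    return sum(((y >> k) & 1) * (-2) ** k for k in range(n))

n = 8
# Relation (1): the base -2 value of booth(x) equals x modulo 2^n.
ok = all((value_base_minus_two(booth(x, n), n) - x) % (1 << n) == 0
         for x in range(1 << n))
print(ok)
print(format(booth(6, n), "b"))   # base -2 digits of 6, most significant first
```

The addition carries information from low bits to high bits only, which is why the formula defines a causal function.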

The infinite Binary Algebra D = Z 2 , contains noteworthy sub-structures: 

F 2 C C N C Z C P2 C P C A 2 C Z 2 C Z 2 . 

Here: are the finite sequences, P the ultimately periodic sequences, P2 those 

of period length 2^^, A 2 the 2-algebraic (definition 6), and Z 2 the computable 2- 
adic integers. The appendix lists the closure properties of these sets, with respect 
to Binary Algebra operations. 

3 Causal Function 

Let $\|x\| \in \mathbb{Q}$ denote the 2-adic norm of $x \in \mathbb{D}$: $\|0\| = 0$, $\|1 + zx\| = 1$, and
$\|zx\| = \|x\|/2$. The distance $\|a - b\|$ between digital sequences $a, b \in \mathbb{D}$ is
ultra-metric: $\|a + b\| \le \max\{\|a\|, \|b\|\}$. Note that: $\|a - b\| = \|a \oplus b\|$.

Definition 1. A digital function is causal, when the following (equivalent
statements) hold:

1. $\forall a, b \in \mathbb{D} : \|f(a) - f(b)\| \le \|a - b\|$.

2. Each output bit is a Boolean function $f_n \in \mathbb{B}^{n+1} \to \mathbb{B}$, which exclusively
depends on the first $n + 1$ bits of input:

$y_n = f_n(x_0 x_1 \cdots x_{n-1} x_n) = f_n(x)$,

$y = (y_n) = f(x) = (f_n(x)) = \sum f_n(x) z^n$.

The operators -i, n, U, ©, x, ©, 0, , x, / are causal. The antiflop z~ (de- 

fined by j/n = Xn-k), and down-sampling | are not causal. We simply say causal 
/, when / is a causal digital function, with a single input x, and a single output 
y = /(x); otherwise, we explicitly state the number of inputs, and outputs. 
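The norm condition of Definition 1 can be explored on short prefixes. The following is a sketch under assumptions, not a proof: sequences are truncated to `N` bits and packed into integers, the distance is taken as the norm of the XOR (as noted above), and the four sample operators and all names are our own choices.

```python
from fractions import Fraction

N = 6                      # prefix length (an arbitrary choice for this sketch)
MASK = (1 << N) - 1

def norm(x):
    """2-adic norm of a packed prefix: 2^-(index of the lowest set bit)."""
    if x == 0:
        return Fraction(0)
    lowest = (x & -x).bit_length() - 1
    return Fraction(1, 2 ** lowest)

def looks_causal(f):
    """Check ||f(a) - f(b)|| <= ||a - b|| for all pairs of N-bit prefixes."""
    return all(norm(f(a) ^ f(b)) <= norm(a ^ b)
               for a in range(1 << N) for b in range(1 << N))

def add_01(x):             # x + (01), addition with carries
    return (x + 0b101010) & MASK

def shift(x):              # z x: delay by one cycle
    return (x << 1) & MASK

def antiflop(x):           # z^- x: output bit n is input bit n + 1
    return x >> 1

def down(x):               # down-sampling: output bit n is input bit 2n
    return sum(((x >> (2 * n)) & 1) << n for n in range(N // 2))

for name, f in [("add_01", add_01), ("shift", shift),
                ("antiflop", antiflop), ("down", down)]:
    print(name, looks_causal(f))
```

On these prefixes the addition and the delay pass the test, while the antiflop and down-sampling fail it, in line with the text.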
3.1 Truth Table

Definition 2. The truth table of a causal function $f(x) = (f_n(x))$ combines
the tables for each Boolean function $f_n \in \mathbb{B}^{n+1} \to \mathbb{B}$ into a unique digital
sequence $\mathrm{truth}(f) = F = (F_N) \in \mathbb{D}$, defined by

$F_N = f_m(b_0 b_1 \cdots b_{m-1} b_m)$;

here: $m = \lfloor \log_2(N + 2) \rfloor - 1$, and $\sum_{k \le m} b_k 2^k = N + 2 - 2^{m+1}$.

Proposition 1. The truth table $F = \mathrm{truth}(f) \in \mathbb{D}$ is a one-to-one digital code,
for each causal function $f = \mathrm{truth}^{-1}(F) \in \mathbb{D} \to \mathbb{D}$.
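To make the indexing of Definition 2 concrete, here is a small sketch that lists a truth table prefix level by level. It assumes the reading $m = \lfloor \log_2(N+2) \rfloor - 1$ with offset $N + 2 - 2^{m+1}$ whose binary digits are $b_0 \ldots b_m$ (least significant first). The function names are ours; `zerotest` is taken to output 1 as long as every input bit seen so far is 0, which is our reading of the truth table printed for it in this paper.

```python
def truth_prefix(f, levels):
    """First 2^(levels+1) - 2 bits of truth(f), as a string.

    f takes the tuple (x_0, ..., x_m) and returns the output bit y_m.
    """
    out = []
    for N in range(2 ** (levels + 1) - 2):
        m = (N + 2).bit_length() - 2            # floor(log2(N + 2)) - 1
        offset = N + 2 - 2 ** (m + 1)           # sum of b_k 2^k
        bits = tuple((offset >> k) & 1 for k in range(m + 1))
        out.append(str(f(bits)))
    return "".join(out)

def zerotest(bits):
    """Assumed behaviour: 1 while all inputs so far are 0, else 0."""
    return int(not any(bits))

def booth(bits):
    """Output bit m of (01) XOR (x + (01)) for the input prefix x_0..x_m."""
    m = len(bits) - 1
    x = sum(b << k for k, b in enumerate(bits))
    mask01 = sum(1 << k for k in range(1, m + 1, 2))
    return ((mask01 ^ (x + mask01)) >> m) & 1

print(truth_prefix(zerotest, 4))
print(truth_prefix(booth, 4))
```

Both printed strings can be compared with the prefixes of truth(zerotest) and truth(booth) quoted elsewhere in the paper.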



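Proposition 2. For causal $f$ and $g$:

$\mathrm{truth}(\neg f) = \neg\, \mathrm{truth}(f)$,
$\mathrm{truth}(f \cup g) = \mathrm{truth}(f) \cup \mathrm{truth}(g)$,
$\mathrm{truth}(f \cap g) = \mathrm{truth}(f) \cap \mathrm{truth}(g)$,
$\mathrm{truth}(zf) = 1 + z \uparrow z\, \mathrm{truth}(f)$.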



3.2 Automatic Sequence 

Although it is traditionally associated to a finite causal /, which is explicitly 
presented by a finite state automaton, the definition of an automatic sequence 
[9], may be extended to all causal functions, finite and infinite. 

Definition 3. The automatic sequence $\mathrm{auto}(f) = (a_n) \in \mathbb{D}$ is associated to
the causal function $f$ by:

$a_n = f_m(b_0 b_1 \cdots b_{m-1} b_m)$,

where $m = 0$ if $n = 0$, else $m = \lfloor \log_2(n) \rfloor$, and $n = \sum_{k \le m} b_k 2^k$.

In general, the value y = (y-t^) = f{x) of causal /, at x = (xn), cannot be 
reconstructed, from its automatic sequence auto(/). Indeed, consider the causal; 
firstbit(x) = xfll, and zerotest(x) = x0x). Both have the same automatic 

number: auto(firstbit) = auto(zerotest) = 1(0). While truth(firstbit) = 1, we 
have T = truth(zerotest) = 101000100000001000000000000000100 ■ ■ ■ ^ 1. 

Proposition 3. Let f be causal. The derived causal functions, g{x) = f{^z~^x) 
and h{x) = zf{z~x), are such that: 

auto{f) = truth{g), 
truth{f) = z~‘^auto{h). 



3.3 Time Reversal 

Definition 4. The time reversed function $\tilde f$ is defined by

$\tilde f(x) = \sum f_n(x_n \cdots x_0) z^n$,

where the causal function $f$ is given (definition 1) by:

$f(x) = \sum f_n(x_0 \cdots x_n) z^n$.
The reversed truth table $\mathrm{truth}(\tilde f) = (F_{\tilde N})$ is related to $\mathrm{truth}(f) = (F_N)$ through:

$\tilde N = (0\ 1\ 2\ 4\ 3\ 5\ 6\ 10\ 8\ 12\ 7\ 11\ 9\ 13\ 14\ 22 \cdots)$.
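The index sequence above can be reproduced by a short computation. The sketch assumes that time reversal reverses the order of the input bits within each level of the truth table, so that an index keeps its level and has its offset bit-reversed; this assumption is ours, chosen because it matches the sixteen listed values. The name `reversed_index` is hypothetical.

```python
def reversed_index(N):
    """Index obtained by bit-reversing the offset of N within its level."""
    m = (N + 2).bit_length() - 2        # level of N in the truth table
    base = 2 ** (m + 1) - 2             # first index of level m
    offset = N - base
    width = m + 1                       # level m has (m + 1)-bit offsets
    rev = int(format(offset, "0{}b".format(width))[::-1], 2)
    return base + rev

print([reversed_index(N) for N in range(16)])
```

Applying the map twice returns the original index, as expected of a reversal.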

Let prefix(/) = + z^x) : a, & e N, a < 2^}, and suflix(/) = prefix(/). 

Proposition 4. The class of causal functions is closed under composition, pre- 
fix, suffix, and time reversal operations. 




Fig. 1. Sequential Decision Tree, for the truth table (F^). 



4 Universal Causal Machines 

4.1 Sequential Decision Tree 

Definition 5. The Sequential Decision Tree sdt(f), for computing causal $f$, is
a complete infinite binary tree - fig. 1. A digital input $x \in \mathbb{D}$ specifies a unique
path through the tree: start at the root, for cycle 0; at cycle $n$, move down, to the
left if $x_n = 0$, right otherwise. Arcs in the tree are labeled, in hierarchical order,
by bits from the time reversed truth(f). Output $y = f(x)$ is the digital sequence
of arc labels, along the path specified by input $x$.



4.2 Sequential Multiplexer 

A Digital Synchronous Circuit DSC is obtained by composing primitive
components: the register reg, and Boolean (combinational, memoryless) operators.
There is a restriction on composition: all combinational paths, through a chain
of Boolean operators, must be finite. This implies that each feedback loop must
contain at least one memory element reg (positive feedback).

Fig. 2. Sequential multiplexer, for the truth table

The operators ($\mathrm{reg}_0$, $\mathrm{reg}_1$, mux) serve as a base for the SDD procedure:
registers $\mathrm{reg}_0(x) = \mathrm{reg}(x) = zx = 2x$, $\mathrm{reg}_1(x) = \neg z \neg x = 1 + 2x$, and multiplexer
$\mathrm{mux}(c, b, a) = (c \cap b) + (\neg c \cap a) = c \cap (b \oplus a) \oplus a$.

The sequential multiplexer SM(f), from [4], is shown in fig. 2. The registers
in SM(f) are labeled, 0 for $\mathrm{reg}_0$ and 1 for $\mathrm{reg}_1$, by truth(f), in direct order.
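The register and multiplexer identities can be checked exhaustively on short prefixes. The sketch below assumes 4-bit prefixes packed into integers, with `~` standing for the complement and `<< 1` for the delay $z$; the names `reg0`, `reg1`, `mux` mirror the operators of the text but the code itself is only an illustration.

```python
N = 4
MASK = (1 << N) - 1
VALUES = range(1 << N)

def reg0(x):
    """reg_0(x) = z x: shift in a 0."""
    return (x << 1) & MASK

def reg1(x):
    """reg_1(x) = not z not x: shift in a 1."""
    return ~((~x & MASK) << 1) & MASK

def mux(c, b, a):
    """Bitwise selection: b where c is 1, a where c is 0."""
    return (c & b) + (~c & MASK & a)

regs_ok = all(reg0(x) == (2 * x) & MASK and reg1(x) == (1 + 2 * x) & MASK
              for x in VALUES)
mux_ok = all(mux(c, b, a) == (c & (b ^ a)) ^ a
             for c in VALUES for b in VALUES for a in VALUES)
print(regs_ok, mux_ok)
```

The sum in `mux` never produces a carry because the two terms have disjoint bits, which is why it coincides with the carry-free form on the right-hand side.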



4.3 Share Common Expressions 

The next step, in the infinite SDD construction [4], is to share all common 
sub-expressions, which appear in the process: the result is the Sequential Deci- 
sion Diagram SDD(/), for SM - see fig. 4. Similarly, for SDT, we obtain the 
Sequential Binary Automaton SBA(/) - see fig. 3. 



5 Finite Causal Function 

The causal functions mentioned so far may all be realized by finite circuits, and
finite state machines FSM, except for $\times$, $/$, $\otimes$, $\oslash$ and $\uparrow$, which are infinite [4].

Definition 6. Digital sequence $b$ is 2-algebraic, when $b(z) = \sum b_n z^n$ is
algebraic over $\mathbb{F}_2(z)$. Let $A_2$ denote the set of 2-algebraic sequences.
$T(z) = \mathrm{truth}(\mathrm{zerotest})$ is 2-algebraic, as root of: $1 + T + z^2 T^2 = 0$ (mod 2).
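The quadratic equation for $T$ can be checked on a truncated series. This sketch assumes that the ones of truth(zerotest) sit exactly at the indices $2^{k+1} - 2$ (0, 2, 6, 14, 30, ...), which matches the prefix printed in Section 3.2; polynomials over $\mathbb{F}_2$ are packed into integers, and `square_f2`, `K`, `T`, `residual` are our names.

```python
K = 64   # work modulo z^K

def square_f2(a):
    """Square a polynomial over F_2: the coefficient of z^n moves to z^(2n)."""
    result, n = 0, 0
    while a:
        if a & 1:
            result |= 1 << (2 * n)
        a >>= 1
        n += 1
    return result

# Assumed shape of truth(zerotest): ones at N = 2^(k+1) - 2, below z^K.
T = sum(1 << (2 ** (k + 1) - 2) for k in range(6))

# 1 + T + z^2 T^2, computed without carries and truncated to K coefficients.
residual = (1 ^ T ^ (square_f2(T) << 2)) % (1 << K)
print(residual == 0)
```

Squaring doubles every exponent, so $z^2 T^2$ reproduces $T$ without its constant term, and the three terms cancel.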

Proposition 5. Causal f is finite, if and only if the following equivalents hold: 

a f is computed by a finite circuit DSC; 
b f is computed by a finite state machine FSM; 
c prefix{f) is finite; 
d suffix{f) is finite; 
e truth{f) is 2-algebraic. 

Proof: The equivalence between (a) and (b) is well-known. The equivalence 
between (b), (c) and (d) follows from classical automata theory [7]. 

The equivalence between (d) and (e) is established through a result in the
theory of automatic sequences. Call 2-automatic a sequence $a \in \mathbb{D}$ such that
$a = \mathrm{auto}(f)$, for some FSM $f$.

Theorem 3 (Christol, Kamae, Mendès France, Rauzy [11]).

A digital sequence is 2-automatic if and only if it is 2-algebraic.

Combine Theorem 3 with Proposition 3, to complete the proof of Proposition 5, 
hence that of Theorem 1. ■ 

Proposition 6. Finite causal functions, are closed, under composition, prefix, 
suffix, and time reversal. 



Theorem 4. The class $A_2$ of 2-algebraic sequences is closed under:

1. Boolean operations $\neg, \cup, \cap$, and shifts $z, z^{-}$;

2. carry-free polynomial operations $\oplus, \otimes, \oslash$;

3. up-sampling, down-sampling, and time reversal;

4. application of any finite causal function, hence $+, -$.

Proof: Boolean closure follows from Proposition 2. Polynomial manipulations
show the closure under carry-free operations $\oplus, \otimes$, and shifts. Item 3 follows
from Theorem 6. A novel construction is given for proving item 4. It implies,
in particular, that $A_2$ is closed under ordinary addition and subtraction, with
carries. We conjecture that $A_2$ is also closed under multiplication $\times$ and division
$/$.

5.1 Transcendental Numbers

If one interprets a digital sequence $x = x(z) = x(2)$ in base $1/2$ rather than 2
or $z$, one gets a real number: $x(1/2) \in \mathbb{R}$. To each causal $f$, associate the real
number $\mathrm{real}(f) = \mathrm{truth}(f)(1/2) \in \mathbb{R}$.

Theorem 5 (Loxton, van der Poorten [12]). If $a(z) \in A_2$ is 2-algebraic,
then either $a(1/2) \in \mathbb{Q}$ is rational, or it is transcendental, in the usual sense, over
$\mathbb{Q}$.
As a consequence, real(zerotest) = 1.2656860360875726 ··· is transcendental
over $\mathbb{Q}$. Similarly for real(booth) = 0.6010761186771489 ···.
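Both decimal values can be approximated from the truth table prefixes printed in this paper. The sketch below is a plain floating-point evaluation of $\sum F_n 2^{-n}$ on those prefixes (33 bits for zerotest, 48 bits for booth, copied from Sections 3.2 and 6.1); the name `real_from_bits` is ours, and the result is only as accurate as the prefix is long.

```python
def real_from_bits(bits):
    """Evaluate the sum of F_n * 2^-n over a printed prefix of a truth table."""
    return sum(int(c) * 2.0 ** -n for n, c in enumerate(bits))

T_PREFIX = "101000100000001000000000000000100"                    # truth(zerotest)
F1_PREFIX = "010011001111000000001111111100000000111111111111"     # truth(booth)

print(real_from_bits(T_PREFIX))
print(real_from_bits(F1_PREFIX))
```

The printed values agree with the constants quoted above to well within nine decimal places; the remaining digits depend on bits beyond the listed prefixes.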

Up-sampling $y = \uparrow x$ is causal, and $y_n = f_n(x_0 \cdots x_n)$ is the middle bit: $y_{2n} = x_n$,
and $y_{2n+1} = 0$. The middle bit sequence is the truth table $M = \mathrm{truth}(\uparrow)$:
M = 010000001100110000000000000000000011110000111100 ···. No finite
circuit exists to implement up-sampling [4]. It follows from Theorem 1 that the
middle bit series $M(z)$ is transcendental over $\mathbb{F}_2[z]$. Similarly for $\mathrm{truth}(\otimes)$ and
$\mathrm{truth}(\times)$. It is not known if $\mathrm{real}(\uparrow)$ = 0.5062255860470657 ··· is transcendental
over $\mathbb{Q}$ or not; similarly for $\mathrm{real}(\otimes)$ and $\mathrm{real}(\times)$.



6 Finite SDD Procedure 

For $f$ causal and finite, define size(f) as the number of states in the minimal
FSM (see [7]) for computing $f$. For $F \in \mathbb{D}$, define $S = \mathrm{sample}(F)$ as

$S = \{F\} \cup (z^{-} \downarrow S) \cup (z^{-} \downarrow z^{-} S)$,

where the least fixed point $S \in \wp(\mathbb{D})$ is a set of digital sequences.

Theorem 6. Each of the following (equivalent statements) provides a canonical
representation for $f$ finite causal, with size(f) = n, and F = truth(f).

1. sample(F) is finite, of size n.

2. SBA(f) is the minimal FSM for computing $f$, with n states.

3. SDD(f) is a finite DSC circuit, with n multiplexers, and at most 2n
registers, $\mathrm{reg}_0$ or $\mathrm{reg}_1$.

4. F = truth(f) is the unique 2-algebraic solution to the system quadra(f),
made of n binary quadratic equations.

This is established through an effective algorithm - recursive sampling - and data
structure. In this extended abstract, we simply present the (computer generated)
output from the procedure, for one example: Booth coding, as defined by (1),
and where size(booth) = 4.



6.1 Recursive Sampling 

For f1 = truth(booth), compute sample(f1) = {f1, f2, f5, f11}:

f1 = 010011001111000000001111111100000000111111111111 ···
f2 = 010110000111100001111111100000000000000001111111 ···
f5 = 100110011110000000011111111000000001111111111111 ···
f11 = 101100001111000011111111000000000000000011111111 ···
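The four sequences can be recovered experimentally from a long prefix of truth(booth). The sketch below is not the paper's recursive sampling algorithm; it is a rough illustration that works on a truncated truth table and treats two sampled sequences as equal when their first `key_len` bits agree. It assumes that $z^{-} \downarrow S$ keeps the entries $S_{2k+2}$ and $z^{-} \downarrow z^{-} S$ keeps the entries $S_{2k+3}$. All names are ours.

```python
def booth_truth(levels):
    """Prefix of truth(booth): the tables of f_0 .. f_(levels-1), concatenated."""
    bits = []
    for m in range(levels):
        mask01 = sum(1 << k for k in range(1, m + 1, 2))   # (01), bits up to m
        for x in range(1 << (m + 1)):
            bits.append(((mask01 ^ (x + mask01)) >> m) & 1)
    return bits

def sample(F, key_len=14):
    """Distinct sampled sequences, identified by their first key_len bits."""
    seen = []
    todo = [F]
    while todo:
        S = todo.pop(0)
        key = "".join(map(str, S[:key_len]))
        if len(key) < key_len or key in seen:
            continue                      # too short to compare, or already known
        seen.append(key)
        todo.append(S[2::2])              # z^- down S      : entries S[2k+2]
        todo.append(S[3::2])              # z^- down z^- S  : entries S[2k+3]
    return seen

for key in sample(booth_truth(10)):
    print(key)
```

With these choices the search stops after four distinct prefixes, which begin like f1, f2, f5 and f11 above.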
Fig. 3. The automaton SBA(booth), where $\mathrm{booth}(x) = (01) \oplus (x + (01))$.



6.2 SBA Procedure



6.3 Characteristic Circuit Polynomial 

A binary quadratic equation has the form: $f = a + bz + z^2 g^2 + z^3 h^2$ (mod 2),
for $a, b \in \mathbb{F}_2$, and $f, g, h \in \mathbb{D}$. Truth tables in sample(booth) = {f1, f2, f5, f11}
are related by the following system of binary quadratic equations:

$f_1 = z + z^2 (1 + z) f_2^2$,
$f_2 = z + z^2 f_1^2 + z^3 f_5^2$,
$f_5 = 1 + z^2 f_2^2 + z^3 f_{11}^2$,
$f_{11} = 1 + z^2 (1 + z) f_5^2$.
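The system can be checked against the 48-bit prefixes listed in Section 6.1. This is a sketch under assumptions: each prefix is read as a polynomial over $\mathbb{F}_2$ packed into an integer (bit n is the coefficient of $z^n$), squaring doubles every exponent, and all equalities are tested modulo $z^{48}$. The helper names are ours.

```python
K = 48
MOD_MASK = (1 << K) - 1

def series(bits):
    """Pack a printed prefix into an int: bit n is the coefficient of z^n."""
    return sum(1 << n for n, c in enumerate(bits) if c == "1")

def square_f2(a):
    """Over F_2, a(z)^2 = a(z^2): move the coefficient of z^n to z^(2n)."""
    result, n = 0, 0
    while a:
        if a & 1:
            result |= 1 << (2 * n)
        a >>= 1
        n += 1
    return result

def quadratic(a, b, g, h):
    """a + b z + z^2 g^2 + z^3 h^2 (mod 2), truncated to K coefficients."""
    return (a ^ (b << 1) ^ (square_f2(g) << 2) ^ (square_f2(h) << 3)) & MOD_MASK

f1 = series("010011001111000000001111111100000000111111111111")
f2 = series("010110000111100001111111100000000000000001111111")
f5 = series("100110011110000000011111111000000001111111111111")
f11 = series("101100001111000011111111000000000000000011111111")

# z^2 (1 + z) g^2 is the case h = g.
checks = [f1 == quadratic(0, 1, f2, f2),
          f2 == quadratic(0, 1, f1, f5),
          f5 == quadratic(1, 0, f2, f11),
          f11 == quadratic(1, 0, f5, f5)]
print(checks)
```

Only the first 23 coefficients of each squared series matter below $z^{48}$, so the truncated prefixes are sufficient for this test.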



Through quadratic elimination, derive quad(F): 



quad(booth) = a + bF + c^‘^F-\-d']'^F (mod 2), 
a = z + z^ + z^ + z^ + z^^ + z'^^ + 032 



& = 1 + 0 + 02 + 03 , 

C = 0^(1 + 0 + 02 + 03 + 2^)2^ 

d = 023(1 + 0 + 02 + 03 + 



Through algebraic simplifications, obtain the irreducible characteristic polyno- 
mial poly(booth), of which F = truth(booth) = /I is the only root: 

F = Z + Z^ + z^ + 0^(1 + 0 + 02 + 03)+4 (jnod 2 ). 

A decimal expression for poly (booth): F = 50 + 240 |2 F. 
Fig. 4. The circuit SDD(booth).



6.4 SDD procedure 

The circuit synthesized by the SDD procedure involves the time reversed do- 
main. To keep the correspondence with Theorem 6.3, we show the circuit 
SDD(booth), in fig. 4. This circuit computes the function booth, defined through 
time reversal in equation (1). For SDD(booth) SBA(booth), one finds 15 
states. 

7 Feed-forward Circuit 

Proposition 7 (Feed-forward circuit). The following are characteristic
equivalents, for finite causal $f$ to be free of feedback:

1. SDD(f) is acyclic;

2. $F = \mathrm{truth}(f) \in P_2$ is ultimately periodic, with period length $2^b$, for $b \in \mathbb{N}$;

3. $\mathrm{poly}(f) = a + (z^{2^b} - 1)F$, for $a \in \mathbb{N}$.



Proposition 8 (Combinational circuit). The following are characteristic
equivalents, for finite causal $f$, with $i$ inputs, to be memoryless:

1. SDD(f) contains no register;

2. size(f) = 1;

3. $F = \mathrm{truth}(f) \in P_2$ is periodic, i.e. $-1 \le F(2) \le 0$, with period length $2^{i'}$,
for some integer $i' \le i$;

4. $\mathrm{poly}(f) = a + (z^{2^{i'}} - 1)F$, for some integer $a < 2^{2^{i'}}$.

For Boolean functions, the SDD procedure is the same as the Binary Decision 
Diagrams BDD procedure, from [13]. 
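8 Appendix

We use 14 operators from Binary Algebra: 5 unary operations $\{\neg, z, z^{-}, \uparrow, \downarrow\}$,
and 9 binary operations $\{\cup, \oplus, \cap, \otimes, \oslash, +, -, \times, /\}$. The binary operators are
listed here in order of increasing syntactic precedence, so as to save parentheses.

- $(\mathbb{D}, (0), (1), \neg, \cup, \cap)$ is a Boolean algebra;
  $(\mathbb{D}, (0), (1), \oplus, \cap)$ is a Boolean ring: $a = a \cap a$, $0 = a \oplus a$ (see [8]).

  $a = z^{-} z a$,
  $z z^{-} a = a \cap -2$,
  $\neg z^{-} a = z^{-} \neg a$,
  $\neg z a = 1 + z \neg a$,
  $z^{-}(a \odot b) = z^{-} a \odot z^{-} b$, for $\odot \in \{\cup, \cap, \oplus\}$,
  $z(a \odot b) = za \odot zb$, for $\odot \in \{\cup, \cap, \oplus, +, -\}$,
  $z(a \odot b) = za \odot b = a \odot zb$, for $\odot \in \{\times, \otimes\}$.

- $(\mathbb{D}, 0, 1, \oplus, \otimes, \oslash)$ is an Integral Domain, i.e. a commutative ring without
  divisor of 0. An element $a \in \mathbb{D}$ has a (polynomial) inverse $1 \oslash a$, such that
  $a \otimes (1 \oslash a) = 1$, if and only if $a$ is odd ($1 = a(0)$): $1 \oslash (1 + zb) = \sum (zb)^n$.

- $(\mathbb{D}, 0, 1, +, -, \times)$ is an Integral Domain.

  $\neg a = -a - 1$,
  $a + b = (a \cup b) + (a \cap b) = (a \oplus b) + z(a \cap b)$,
  $a + b = a \cup b = a \oplus b$ iff $a \cap b = 0$,
  $1/(1 - 2b) = \sum (2b)^n = \prod (1 + (2b)^{2^n})$.

  $\uparrow a = a(z^2) = a \otimes a$,
  $a = \downarrow \uparrow a$,
  $a = \uparrow \downarrow a + z \uparrow \downarrow z^{-} a$,
  $\neg \downarrow a = \downarrow \neg a$,
  $\neg \uparrow a = (01) \cup \uparrow \neg a$,
  $\downarrow z^2 a = z \downarrow a$,
  $\uparrow z a = z^2 \uparrow a$,
  $\downarrow (a \odot b) = \downarrow a \odot \downarrow b$, for $\odot \in \{\cup, \cap, \oplus\}$,
  $\uparrow (a \odot b) = \uparrow a \odot \uparrow b$, for $\odot \in \{\cup, \cap, \oplus, \otimes, \oslash\}$.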

We list the known closure properties, for operators and sub-structures, in Binary 
Algebra. 

- F2'^ is closed, under {-1, U, ®, n, 0, 0, 0 , x , /}. 

- N is closed, under {U, ©, n, 0, 0 , t, i, 0, +, x }. 

- Z is closed, under {^, U, ©, n, 0, 0“, |, +, — , x}. 

- P 2 is closed, under {-1, U, ®, n, 0, 0“, T, i, 0 , +, x}. 

- A2 is closed under {^, U, ®, n, 0, 0“, t, i, 0 , 0 , +, — }• The closure under carry- 
free product is shown in [ 14 ]. It is shown in [ 15 ] that A2 is not closed under 
multiplication x with carries. 

- P, Z2 and Z2 are closed, under all 14 operations. 
Abstract

The purpose of this talk is to give an introduction to the domain of Performance 
Evaluation of Networks, its methods and its practical results. The short tour will 
begin with the classical results and finish with some of the principal challenges 
faced by the theory today. 

The concern about mathematically predicting the performance of communi- 
cation systems is not new: the beginning of the theory is traditionally associated 
with the work of A.E. Erlang (1917) on the blocking probability for telephone 
trunk lines. The family of stochastic models used by him and his followers even- 
tually led to Queuing Theory, a wealth of formulas and methods for computing 
throughputs, waiting times, occupation levels of resources and other performance 
measures. 

From the point of view of networking, one of the main achievements of this
theory is perhaps the family of product form theorems for networks of queues,
obtained in the 70’s. When they apply, these theorems allow one to reduce the
analysis of a network to that of each of its elements in isolation. Among numerous
possibilities, the results have been applied to the design of scheduling mechanisms
for computers, to the problem of resource allocation, in particular the optimal
routing in the then-emerging packet switching networks, and to the design of
window flow-control mechanisms.

In the 80’s, new problems appeared with the evolution of networking to
higher speeds and to the integration of the services offered by classical
telecommunications and computer networks. More stress was put on the necessity of
end-to-end “quality of service” (QoS) and “real-time” operation. In parallel, it
was realized that the applications that were to use the networks (voice, video,
data retrieval, distributed computing) generate network traffic very different
from the usual Poisson processes commonly assumed in the models. All this
provoked the emergence of new concepts such as traffic shaping and the equivalent
bandwidth of complex sources. The importance of the scheduling policy for
switching nodes in networks has been emphasized. Current research also tries to
assess the importance of the long range dependence and fractal behavior of the
traffic, which has been measured in local as well as in wide area networks.
Even more recently, the popularization of the Web has provoked a renewed
interest in the analysis of the performance of the Internet, its protocols, its
applications and its evolution. To name just some areas for research:

Internet performance & QoS. Flow control & congestion avoidance: TCP and
its improvement. Reliable multicast. Feedback-less communication and
forward error correction. Differential service. Traffic shaping, policing, pricing.
Network interconnection and tunneling.

Web performance. Information transfer protocols, HTTP 1.1 vs HTTP 1.2. Web
server optimization: caching, multi-threading, mirroring.

Voice & Video. Network-conscious & adaptive compression and transmission.
Dimensioning of buffers, playout. Real-time vs offline handling of video on
demand.

The theoretical foundations of performance evaluation are currently receiving
contributions from other fields of applied mathematics: statistics (time series
analysis, parameter estimation), optimal control theory, game theory (fairness
of resource sharing, individually vs socially optimal behavior).
Abstract. In this work, we review an important kind of knowledge pattern,
emerging patterns (EPs). Emerging patterns are associated with
two data sets, and can be used to describe significant changes between
the two data sets. Discovering all EPs embedded in high-dimension and
large-volume databases is a challenging problem due to the number of
candidates. We describe a special type of EP, called jumping emerging
patterns (JEPs), and review some properties of JEP spaces (the spaces
of jumping emerging patterns). We describe efficient border-based
algorithms to derive the boundary elements of JEP spaces. Moreover, we
describe a new classifier, called DeEPs, which makes use of the
discriminating power of emerging patterns. The experimental results show that
the accuracy of DeEPs is much better than that of k-nearest neighbor
and that of C5.0.



1 Introduction 

We are now experiencing an era of data explosion. Great advances in software
and hardware engineering mean that data are generated and collected at a
tremendous rate and from a very wide variety of sources, including scientific
domains (e.g., the human genome project), government organizations (e.g.,
census projects), and business corporations (e.g., supermarket transactions). With
the major advances in database and storage technology, it is easy for us to
store vast amounts of data on CD-ROMs, hard disks, and magnetic tapes, forming
mountains of data [8]. Traditional statistical techniques for analyzing data
rapidly break down as the volume and dimensionality of the data increase. Now,
the question for us is how to efficiently discover “useful knowledge” from the
mountains of data.

One solution to this problem is the use of Knowledge Discovery in Databases 
(KDD) techniques. Traditionally, KDD is defined as follows [8]: 

Knowledge Discovery in Databases is the non-trivial process of iden- 
tifying valid, novel, potentially useful, and ultimately understandable 
patterns in data. 

The central term in this KDD definition is “patterns”, constrained by some 
interesting properties such as validity, novelty, usefulness, and understandabil- 
ity. A crucial step in KDD processes is to identify “non-trivial” patterns. This 
important step is called Data Mining: 
Data Mining is a step in the KDD process and consists of particular
algorithms that, under some acceptable computational efficiency
limitations, produce a particular enumeration of the required patterns.

In this work, we describe a new knowledge pattern, called emerging pat- 
tern (EP) [6], for KDD. Generally speaking, emerging patterns are associated 
with two data sets, and are used to describe significant changes (differences or 
trends) between these two data sets. In this work, we also propose data mining 
algorithms to efficiently discover and represent EPs. More importantly, to show 
the usefulness of the newly introduced patterns, we apply the ideas of emerging 
patterns to the problem of classification, and propose and develop accurate and 
scalable classifiers. 

In the remainder of this paper, we begin by presenting a collection of
preliminary definitions. In Section 3, we describe two examples of emerging
patterns. We formally define emerging patterns in Section 4. These definitions
include general EPs, jumping emerging patterns, and strong emerging patterns.
In Section 5, we explain how the concept of emerging patterns satisfies the
properties of KDD patterns such as validity, novelty, potential usefulness, and
understandability. In Section 6, we formally describe the space of jumping emerging
patterns. An important property of JEP spaces, convexity, is reviewed, and the
border-based algorithms which are used to discover the boundary elements of
JEP spaces are also outlined in this section. To show the usefulness of EPs, in
Section 7 we review a new classifier, DeEPs, by describing the basic idea behind
it and by reporting its detailed performance. We present some related work
in Section 8. We conclude this paper with a summary.



2 Preliminaries 

In relational databases, the most elementary term is called an attribute. An
attribute has its domain values. The values can be discrete (including
categorical) or continuous. For example, colour can be an attribute in some
database, and the values for colour can be red, yellow, and blue. Another
example is the attribute age, which can have continuous values in the range
[0, 150]. We call an attribute-value pair an item. So, colour-red is an item.
A set of items is simply called an itemset. We also define a set of items as a
transaction or an instance. A database, or a data set, is defined as a set of
transactions (instances). The cardinality or volume of a relational database
$\mathcal{D} = \{T_1, T_2, \cdots, T_n\}$, denoted $|\mathcal{D}|$, is $n$, the number of instances in $\mathcal{D}$, treating
$\mathcal{D}$ as a normal set. The dimension of $\mathcal{D}$ is the number of attributes used in $\mathcal{D}$.

Now, we present other basic definitions. We say that a transaction $T$
contains an itemset $X$ (or, $X$ occurs in $T$) if $X \subseteq T$.

Definition 1. Given a database $\mathcal{D}$ and an itemset $X$, the support of $X$ in $\mathcal{D}$,
denoted $supp_{\mathcal{D}}(X)$, is the percentage of transactions in $\mathcal{D}$ containing $X$. The
count of $X$, denoted $count_{\mathcal{D}}(X)$, is the number of transactions in $\mathcal{D}$ containing
$X$.
The subscript $\mathcal{D}$ in both $supp_{\mathcal{D}}(X)$ and $count_{\mathcal{D}}(X)$ can be omitted if $\mathcal{D}$ is
understood. Observe that

$$supp_{\mathcal{D}}(X) = \frac{count_{\mathcal{D}}(X)}{|\mathcal{D}|},$$

where $|\mathcal{D}|$ is the number of transactions in $\mathcal{D}$.

Definition 2. Given a database $\mathcal{D}$ and a real number $\delta$ ($0 \le \delta \le 1$), an itemset
$X$ is defined as a large (or, frequent) itemset if $supp_{\mathcal{D}}(X) \ge \delta$.

Emerging patterns are intended to capture trends over time and contrasts
between classes. Assume that we are given an ordered pair of data sets $\mathcal{D}_1$ and
$\mathcal{D}_2$. Let $supp_i(X)$ denote $supp_{\mathcal{D}_i}(X)$. The growth rate of an itemset $X$ from $\mathcal{D}_1$
to $\mathcal{D}_2$ is defined as

$$GrowthRate_{\mathcal{D}_1 \to \mathcal{D}_2}(X) = \begin{cases} 0, & \text{if } supp_1(X) = 0 \text{ and } supp_2(X) = 0 \\ \infty, & \text{if } supp_1(X) = 0 \text{ and } supp_2(X) \neq 0 \\ \dfrac{supp_2(X)}{supp_1(X)}, & \text{otherwise.} \end{cases}$$

This definition of growth rates is in terms of supports of itemsets. Alternatively,
growth rates can be defined in terms of counts of itemsets. The counts-based
growth rates are useful for directly calculating probabilities, especially for
situations where the two data sets have very unbalanced populations.
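The definitions of count, support and growth rate translate directly into code. The following is a minimal sketch, assuming that transactions and itemsets are Python sets of item labels and that each data set is a non-empty list of transactions; the toy data and the function names are illustrative only.

```python
import math

def count(db, itemset):
    """Number of transactions in db that contain every item of itemset."""
    return sum(1 for transaction in db if itemset <= transaction)

def supp(db, itemset):
    """Support: fraction of transactions in db that contain itemset."""
    return count(db, itemset) / len(db)

def growth_rate(d1, d2, itemset):
    """Support-based growth rate of itemset from d1 to d2."""
    s1, s2 = supp(d1, itemset), supp(d2, itemset)
    if s1 == 0:
        return 0.0 if s2 == 0 else math.inf
    return s2 / s1

# Toy data sets (hypothetical items a, b, c).
d1 = [{"a", "b"}, {"a"}, {"b"}, {"c"}]
d2 = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"c"}]

print(growth_rate(d1, d2, {"a"}))        # supports 0.5 and 0.75
print(growth_rate(d1, d2, {"a", "b"}))   # supports 0.25 and 0.5
print(growth_rate(d1, d2, {"a", "c"}))   # absent from d1, present in d2
print(growth_rate(d1, d2, {"d"}))        # absent from both
```

A counts-based variant would replace `supp` by `count` in `growth_rate`; the two differ by the factor $|\mathcal{D}_1| / |\mathcal{D}_2|$ whenever the data sets have different sizes.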

3 Two Examples of Emerging Patterns 

Example 1. Millions of EPs were discovered in the mushroom data set (available 
from the UCI Machine Learning Repository [4]) when the minimum growth rate 
threshold was set as 2.5 [11]. The following are two typical EPs consisting of 3 
items: 

X = {(Odor = none), (Gill_Size = broad), (Ring_Number = one)}

and 

Y = {(Bruises = no), (Gill_Spacing = close), (Veil_Color = white)}. 

Their supports and growth rates are shown in Table 1. The EPs with very large 
growth rates are notable characteristics, differentiating edible and poisonous 
mushrooms. 

Example 2. About 120 EP groups were discovered from the U.S. census PUMS 
database (available from www.census.gov) [6]; some of those EPs contain up to 
13 items. They are derived from the population of Texas to that of Michigan using
the minimum growth rate threshold 1.2. A typical one is: {Disabl1:2, Lang1:2,
Means:1, Mobilili:2, Perscar:2, Rlabor:1, Travtim:[1..59], Work89:1}; the items relate
to disability, language at home, means of transport, personal care, employment 
status, travel time to work, and working or not in 1989. Such EPs can describe 
differences of population characteristics amongst distinct social groups. Clearly, 
domain experts can analyze such EPs, and select the useful ones for further 
consideration in their specific applications.

Table 1. The supports and growth rates of two EPs.

EP    supp_in_poisonous    supp_in_edible    growth_rate
X     0%                   63.9%             ∞
Y     81.4%                3.8%              21.4
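
As a quick arithmetic check of Table 1 against the growth-rate definition of Section 2 (the subscripts naming the direction of comparison are our own shorthand, not notation used elsewhere in the paper):

$$\mathit{GrowthRate}_{\text{edible} \to \text{poisonous}}(Y) = \frac{81.4\%}{3.8\%} \approx 21.4, \qquad \mathit{GrowthRate}_{\text{poisonous} \to \text{edible}}(X) = \infty,$$

the latter because $X$ has support 0% in the poisonous class and a non-zero support (63.9%) in the edible class.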



4 Definitions of Emerging Patterns 

4.1 A Definition for General EPs 

Having defined growth rates, emerging patterns are defined as follows. 

Definition 3. Given $\rho > 1$ as a growth rate threshold, an itemset $X$ is called
a $\rho$-emerging pattern from $\mathcal{D}_1$ to $\mathcal{D}_2$ if $\mathit{GrowthRate}_{\mathcal{D}_1 \to \mathcal{D}_2}(X) \ge \rho$.

A $\rho$-emerging pattern is sometimes called a $\rho$-EP for short, or simply an EP when $\rho$
is understood. “An EP from $\mathcal{D}_1$ to $\mathcal{D}_2$” is also sometimes stated as “an EP in
(or of) $\mathcal{D}_2$” when $\mathcal{D}_1$ is understood. By the support of an EP, we mean
its support in $\mathcal{D}_2$.

Note that emerging patterns are closely associated with two data sets $\mathcal{D}_1$ and
$\mathcal{D}_2$. Interchangeably, the data set $\mathcal{D}_1$ will be called the background
data set of the EPs, and $\mathcal{D}_2$ the target data set, particularly for the cases where
data are time-ordered. For other cases where data are class-related (e.g.,
poisonous class and edible class in the mushroom database), the data set $\mathcal{D}_1$
is called the negative class and $\mathcal{D}_2$ the positive class.

Sample EPs were given in Section 3. For instance, the itemset Y in Example 1
is an EP from the edible-mushrooms data set to the poisonous-mushrooms data
set with a growth rate of 21.4. For this particular EP, the edible mushrooms
are considered as the negative class and the poisonous mushrooms as the
positive class; its support is 81.4%.

The growth rate of an EP measures the degree of changes in its supports, 
and it is of primary interest in our studies. The actual supports of EPs are only 
of secondary interest. 

Given a growth rate threshold $\rho$ and two data sets $\mathcal{D}_1$ and $\mathcal{D}_2$, the two supports
of every emerging pattern from $\mathcal{D}_1$ to $\mathcal{D}_2$ can be described by Figure 1.
In this support plane, the horizontal axis measures the support of every itemset
in the target data set $\mathcal{D}_2$; the vertical axis measures the support in the background
data set $\mathcal{D}_1$. So, for each emerging pattern $X$ from $\mathcal{D}_1$ to $\mathcal{D}_2$, the point
$(\mathit{supp}_2(X), \mathit{supp}_1(X))$ must be enclosed by the triangle $\triangle ABC$. As discussed in
[6], it is very difficult to discover all emerging patterns embedded in two dense
high dimensional data sets.

[Figure 1. The support plane: horizontal axis $\mathit{supp}_2(X)$, vertical axis $\mathit{supp}_1(X)$.]

4.2 Definitions for More Specific EPs

Two special types of emerging patterns are introduced in this section. The first
type are the EPs with the growth rate of $\infty$, specifically called jumping emerging
patterns. The second type are those EPs satisfying the subset-closure property,
called strong emerging patterns. Formally,

Definition 4. A jumping emerging pattern (JEP) from $\mathcal{D}_1$ to $\mathcal{D}_2$ is defined as
an emerging pattern from $\mathcal{D}_1$ to $\mathcal{D}_2$ with the growth rate of $\infty$.

The itemset X in Example 1 is a JEP of the edible-mushroom data set, since
it occurs in this data set with a support of 63.9% but has 0% support (it does
not occur) in the poisonous-mushroom data set. In other words, this pattern
dominates in edible mushrooms, whereas it does not occur in any poisonous
instance. Because the support of every jumping emerging pattern rises from 0% in the
background data set to a non-zero support in the target data
set, we add the “jumping” prefix to our traditional EPs to
name JEPs.

For any jumping emerging pattern $X$ of $\mathcal{D}_2$, the point $(\mathit{supp}_2(X), \mathit{supp}_1(X))$
must lie on the horizontal axis in Figure 1. This is simply because $\mathit{supp}_1(X) = 0$.

Emerging patterns do not always have the subset-closure property [6]. (A
collection $\mathcal{C}$ of sets is said to have the subset-closure property if and only if all
subsets of any set $X$ ($X \in \mathcal{C}$) belong to $\mathcal{C}$.) Therefore, a proper subset of a
known EP is not necessarily an emerging pattern. The notion of strong emerging
patterns is proposed to describe the emerging patterns satisfying the
subset-closure property. Formally,

Definition 5. For a given growth rate threshold $\rho$, an emerging pattern is defined
to be a strong emerging pattern if all of its nonempty subsets are also EPs.

The problem of efficiently mining strong emerging patterns is fundamentally 
similar to the problem of mining frequent itemsets [1]. Approaches to efficiently 
mining frequent itemsets include Apriori [2] and Max-Miner [3].
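
Definition 5 can be checked by brute force on small examples. The sketch below is only an illustration of the definition, not one of the efficient mining approaches cited above: it enumerates every nonempty subset, which is exponential in the size of the itemset. The helper names and the toy data sets `background` and `target` are hypothetical.

```python
import math
from itertools import combinations

def supp(itemset, dataset):
    """Support of `itemset` in `dataset`, as a fraction."""
    return sum(1 for t in dataset if itemset <= t) / len(dataset)

def growth_rate(itemset, d1, d2):
    """Growth rate from d1 to d2 (three-case definition of Section 2)."""
    s1, s2 = supp(itemset, d1), supp(itemset, d2)
    if s1 == 0:
        return 0.0 if s2 == 0 else math.inf
    return s2 / s1

def is_ep(itemset, d1, d2, rho):
    """True if `itemset` is a rho-emerging pattern from d1 to d2."""
    return growth_rate(itemset, d1, d2) >= rho

def is_strong_ep(itemset, d1, d2, rho):
    """True if every nonempty subset (including the itemset itself) is an EP."""
    items = sorted(itemset)
    return all(
        is_ep(frozenset(sub), d1, d2, rho)
        for size in range(1, len(items) + 1)
        for sub in combinations(items, size)
    )

# Hypothetical toy data: {a, b} is an EP, but its subset {a} is not.
background = [frozenset("a"), frozenset("a"), frozenset("b"), frozenset("c")]
target = [frozenset("ab"), frozenset("ab"), frozenset("c"), frozenset("c")]

print(is_ep(frozenset("ab"), background, target, rho=2))
print(is_strong_ep(frozenset("ab"), background, target, rho=2))
```

In this toy data the itemset {a, b} never occurs in the background data, so it is an EP, while {a} has the same support in both data sets and therefore fails the threshold.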

5 Emerging Patterns: A Type of KDD Pattern 

Emerging patterns satisfy the following properties. 

— Validity: The discovered patterns should be valid on new data with some
degree of certainty. In our experiments, we found that most of the EPs
remain EPs after the previous databases are updated by adding a small
percentage of new data. We also found that there exist some EPs that tend
to remain EPs, with a high degree of certainty, even when a large percentage
of new data is incorporated into the previously processed data. These facts
imply that we can capture some of the nature of the systems by using the concept
of EPs. Hence, EPs are a type of valid knowledge pattern.

— Novelty: Sometimes, EPs are long, and such patterns might never be discovered by
the traditional statistical methods. These astonishingly long patterns
provide new insights into previously “well” understood problems.

— Potential Usefulness: Emerging patterns can describe trends in any two
non-overlapping temporal data sets and describe significant differences in any
two spatial data sets. Clearly, the discovered EPs can be used for predicting
business market trends, for identifying hidden causes of some specific diseases
among different racial groups, for handwritten character recognition, and
for differentiating positive instances and negative instances (e.g., edible or
poisonous, win or fail, healthy or sick).

— Understandability: Emerging patterns are basically conjunctions of simple 
conditions. For example, the pattern 

{(Odor = none), (Gill_size = broad), (Ring_number = one)} 

consisting of three simple conditions, is an EP in the mushroom database 
[4]. Two facts concerning this pattern are: 

1. Given a mushroom, if its Odor is none and its Gill_size is broad and 
its Ring_number is one, then this mushroom must be edible rather than 
poisonous. 

2. About 63.9% of edible mushrooms have the above physical characteris- 
tics. But, no single poisonous mushroom satisfies those three character- 
istics. 

Clearly, this pattern describes a difference between edible and poisonous
mushrooms. It can be seen that EPs are easily understandable due
to the lack of complexity involved in their interpretation.

In the subsequent sections, we describe some properties of the space of jump- 
ing emerging patterns, and describe the usefulness of EPs in constructing our 
newly proposed DeEPs classifier [10].

6 The Space of Jumping Emerging Patterns

Since jumping emerging patterns are special EPs, with the specification that 
their supports in one data set are zero but non-zero in the other data set, we are 
able to use this constraint to efficiently discover all jumping emerging patterns. 

In Section 2, we have presented some basic definitions such as attributes, 
itemsets, supports, large itemsets, and instances. Here, we require further basic 
definitions for us to describe JEP spaces. These definitions mainly include spe- 
cific itemsets, general itemsets, maximal itemsets, and minimal item- 
sets. 

Definition 6. Given two itemsets $I_1$ and $I_2$, itemset $I_1$ is more general than
itemset $I_2$ if $I_1 \subset I_2$; it is also said that $I_2$ is more specific than $I_1$.

For example, the set {1, 2} is more general than the set {1, 2, 3, 4}. In other
words, the set {1, 2, 3, 4} is more specific than {1, 2}.

Definition 7. Given a collection $\mathcal{C}$ of itemsets, an itemset $X \in \mathcal{C}$ is defined
as maximal in $\mathcal{C}$ if there is no proper superset of $X$ in $\mathcal{C}$. Similarly, an itemset
$Y \in \mathcal{C}$ is defined as minimal in $\mathcal{C}$ if there is no proper subset of $Y$ in $\mathcal{C}$.

For example, for the collection of {{1}, {2}, {1, 2}, {1, 2, 3}, {1, 2, 4}}, a maximal 
itemset in this collection is {1,2,3} and a minimal itemset is {1}. 

For a given collection of itemsets, observe that its maximal itemsets are equiv- 
alently the most specific elements in it; its minimal itemsets are equivalently 
the most general elements in it. 
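
Definition 7 can be illustrated with a short Python sketch. The function names `maximal` and `minimal` are our own, and the quadratic pairwise comparison is chosen for clarity rather than efficiency. The collection is the one used in the example above.

```python
def maximal(collection):
    """Itemsets with no proper superset in the collection (most specific)."""
    sets = set(collection)
    return {x for x in sets if not any(x < y for y in sets)}

def minimal(collection):
    """Itemsets with no proper subset in the collection (most general)."""
    sets = set(collection)
    return {x for x in sets if not any(y < x for y in sets)}

collection = [frozenset(s) for s in ({1}, {2}, {1, 2}, {1, 2, 3}, {1, 2, 4})]

print(maximal(collection))  # the two 3-element itemsets
print(minimal(collection))  # the two singletons
```

Here `x < y` is Python's proper-subset test on sets, which matches the "proper superset" and "proper subset" wording of the definition.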

For an arbitrary set $S$, recall that $|S|$ represents the cardinality (or length)
of $S$, namely, the number of the elements in $S$. Therefore, the most specific
elements in a collection always have the largest cardinality among the elements.

6.1 JEP Spaces and Convexity 

Suppose we are given a set of positive and negative instances. When all
individual jumping emerging patterns are gathered together and viewed as a whole,
the resulting collection has a new property in addition to the sharp discriminating
power held by each of its elements. This collection is called a JEP space, and
the property is called convexity. Formally,

Definition 8. Given a set $\mathcal{D}_p$ of positive instances and a set $\mathcal{D}_n$ of negative
instances, the JEP space with respect to $\mathcal{D}_p$ and $\mathcal{D}_n$ is defined as the set of all
the JEPs from $\mathcal{D}_n$ to $\mathcal{D}_p$.

“A JEP from $\mathcal{D}_n$ to $\mathcal{D}_p$” is sometimes referred to as “a JEP with respect to
$\mathcal{D}_p$ and $\mathcal{D}_n$”. Recall that an itemset $X$ is considered to occur in a data set $\mathcal{D}$ if and
only if one or more instances in $\mathcal{D}$ contain this itemset, namely, $\mathit{supp}_{\mathcal{D}}(X) > 0$.
So, a JEP space can be stated as a collection in which each element occurs
in $\mathcal{D}_p$ but not in $\mathcal{D}_n$.

Note that JEP space is significantly different from version space [14,15] because
of different consistency restrictions on their elements with the data. In our
framework, version spaces are defined as follows:

Definition 9. Given a set $\mathcal{D}_p$ of positive instances and a set $\mathcal{D}_n$ of negative
instances, the version space with respect to $\mathcal{D}_p$ and $\mathcal{D}_n$ is the set of all the
itemsets whose supports in $\mathcal{D}_p$ are 100% and whose supports in $\mathcal{D}_n$ are 0%.

Consequently, each element in a version space must occur (or be contained)
in every positive instance and no negative instance (under the partial order of
set-containment). This condition is much stricter than that of JEPs. In practice,
for example, the data sets in the UCI Machine Learning Repository (Blake &
Murphy, 1998) always produce empty version spaces rather than those discussed
in (Hirsh, 1994; Mitchell, 1982), which contain a large, sometimes even infinite,
number of elements. With a weaker consistency restriction, JEP space becomes
more useful in practice.

We now show two examples. 



Table 2. Weather conditions and Saturday Morning Sport.

Instance   Outlook   Temperature   Humidity   Windy   SaturdayMorningSport
1          rain      mild          high       false   Yes
2          rain      mild          normal     false   Yes
3          sunny     hot           high       false   No
4          sunny     hot           high       true    No


Example 3. Table 2 contains four training instances, each represented by four
attributes: Outlook, Temperature, Humidity, and Windy. The positive
instances are those where the weather conditions are good (Yes) for
SaturdayMorningSport, and the negative instances are those where the weather
conditions are not good (No). For this data set, the version space consists of the
following six itemsets: {rain}, {mild}, {rain, mild}, {rain, false}, {mild, false},
{rain, mild, false}. (The attribute names are omitted in the itemsets if they
are understood.) Note that these itemsets occur in the positive class with 100%
support but they do not occur in the negative class at all.
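
The version space of Example 3 can be reproduced by brute force from Definition 9. The following Python sketch is only an illustration for this four-instance table; the function names are our own, and, as in the example, attribute names are dropped and each instance is written as a set of attribute values. An itemset has 100% support in the positive class exactly when it is contained in the intersection of all positive instances, so only subsets of that intersection need to be examined.

```python
from itertools import combinations

# Table 2, with each instance written as a set of attribute values.
positives = [
    frozenset({"rain", "mild", "high", "false"}),
    frozenset({"rain", "mild", "normal", "false"}),
]
negatives = [
    frozenset({"sunny", "hot", "high", "false"}),
    frozenset({"sunny", "hot", "high", "true"}),
]

def occurs(itemset, dataset):
    """True if at least one instance of `dataset` contains `itemset`."""
    return any(itemset <= t for t in dataset)

def version_space(pos, neg):
    """Itemsets contained in every positive and in no negative instance."""
    common = sorted(frozenset.intersection(*pos))
    space = set()
    for size in range(len(common) + 1):
        for sub in combinations(common, size):
            candidate = frozenset(sub)
            if not occurs(candidate, neg):
                space.add(candidate)
    return space

for itemset in sorted(version_space(positives, negatives), key=sorted):
    print(sorted(itemset))
```

The empty itemset and {false} are excluded because both occur in a negative instance, which leaves the six itemsets listed in Example 3.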

Example 4. Continuing with Table 2, the JEP space with respect to those four
instances consists of the following itemsets: {r}, {m}, {r,m}, {r,high}, {r,f},
{m,high}, {m,f}, {r,m,high}, {r,m,f}, {m,high,f}, {r,high,f}, {r,m,high,f},
{r,n}, {m,n}, {r,m,n}, {m,n,f}, {r,n,f}, and {r,m,n,f}. (For convenience,
we use the first letter of the attribute values to represent those values
when there is no confusion.)

Because of different consistency restrictions on the elements within the training
data, the sizes of version space and JEP space are quite different as shown
in Example 3 and Example 4. For the same training data, the JEP space always
contains the elements covered by the version space.

One may argue that the supports of most JEPs are too small to be useful.
Fortunately, our border-based algorithms can efficiently derive the most general
JEPs, namely those with the largest supports, and use them to form one
bound of the “border” representation of JEP spaces. So, if the smaller support
JEPs are not interesting, then we can focus on the boundary JEPs. Moreover, the
non-boundary JEPs can be easily generated from the boundary elements. The
supports of the most general JEPs can reach 60% or even 70% in some data sets
such as the mushroom and nursery data sets. Generally the support level of the
most general JEPs discovered in our experiments is in the range of 10% to 20%.

The size of JEP spaces can be large. For instance, the JEP space in Example 4
contains 18 itemsets. In another example, the JEP space of the mushroom data
[4] contains up to $10^8$ itemsets. Enumerating all these itemsets is prohibitively
expensive. Interestingly, JEP spaces evince a property called convexity [9] or
interval closure [6]. By exploiting this property, JEP spaces can be succinctly
represented by the most general and the most specific elements among them.

Definition 10. [9,6] A collection $\mathcal{C}$ of sets is said to be a convex space if, for
all $X$, $Y$ and $Z$, the conditions $X \subseteq Y \subseteq Z$ and $X, Z \in \mathcal{C}$ imply that $Y \in \mathcal{C}$.

If a collection is a convex space, we say it holds convexity or it is interval
closed.

Example 5. All of the sets {1}, {1,2}, {1,3}, {1,4}, {1,2,3}, and {1,2,4} form
a convex space. The set $\mathcal{L}$ of all the most general elements in this space is {{1}};
the set $\mathcal{R}$ of all the most specific elements in this space is {{1, 2, 3}, {1, 2, 4}}.
All the other elements can be considered “between” $\mathcal{L}$ and $\mathcal{R}$.
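
Convexity in the sense of Definition 10 can be tested directly on small collections. The sketch below is a naive check written only for illustration: for every nested pair X ⊆ Z in the collection it enumerates all sets between them. The function name `is_convex` is our own.

```python
from itertools import combinations

def is_convex(collection):
    """True if X <= Y <= Z with X, Z in the collection implies Y is in it."""
    sets = set(collection)
    for x in sets:
        for z in sets:
            if not x <= z:
                continue
            extra = sorted(z - x)
            # Every Y between X and Z is X plus some subset of Z - X.
            for size in range(len(extra) + 1):
                for added in combinations(extra, size):
                    if x | frozenset(added) not in sets:
                        return False
    return True

example5 = [frozenset(s) for s in ({1}, {1, 2}, {1, 3}, {1, 4}, {1, 2, 3}, {1, 2, 4})]
not_convex = [frozenset({1}), frozenset({1, 2, 3})]  # {1, 2} and {1, 3} are missing

print(is_convex(example5))
print(is_convex(not_convex))
```

The second collection fails the test because sets such as {1, 2} lie between its two members without belonging to it.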



Theorem 1. Given a set $\mathcal{D}_p$ of positive instances and a set $\mathcal{D}_n$ of negative
instances, the JEP space with respect to $\mathcal{D}_p$ and $\mathcal{D}_n$ is a convex space.

Using this property, JEP spaces can be represented and bounded by two sets
like the sets $\mathcal{L}$ and $\mathcal{R}$ in Example 5, where $\mathcal{L}$ and $\mathcal{R}$ play the boundary role.

With the $\mathcal{L}$-and-$\mathcal{R}$ representation, all the other JEPs in a JEP space can be
generated and recognized by examining its bounds. We next formalize the two
boundary sets by using the concept of borders.



6.2 Using Borders to Represent JEP Spaces 

A border is a structure consisting of two bounds. A simple example might
be <{{a}, {b}}, {{a,b,c}, {b,d}}>, which represents all those sets which are
supersets of {a} or {b} and subsets of {a,b,c} or {b,d}. Formally,

Definition 11. [6] An ordered pair $\langle \mathcal{L}, \mathcal{R} \rangle$ is called a border, $\mathcal{L}$ the left bound
of this border and $\mathcal{R}$ the right bound, if (i) each one of $\mathcal{L}$ and $\mathcal{R}$ is an anti-chain,
that is, a collection of sets in which any two elements $X$ and $Y$ satisfy $X \not\subseteq Y$
and $Y \not\subseteq X$, and (ii) each element of $\mathcal{L}$ is a subset of some element in $\mathcal{R}$ and
each element of $\mathcal{R}$ is a superset of some element in $\mathcal{L}$. The collection of sets
represented by, or the set interval of, a border $\langle \mathcal{L}, \mathcal{R} \rangle$ consists of those itemsets
which are supersets of some element in $\mathcal{L}$ but subsets of some element in $\mathcal{R}$. This
collection is denoted $[\mathcal{L}, \mathcal{R}] = \{Y \mid \exists X \in \mathcal{L}, \exists Z \in \mathcal{R} \text{ such that } X \subseteq Y \subseteq Z\}$.
The collection $[\mathcal{L}, \mathcal{R}]$ is said to have $\langle \mathcal{L}, \mathcal{R} \rangle$ as border.

There is a one-to-one correspondence between borders and convex spaces. 

Proposition 1. Each convex space $\mathcal{C}$ has a unique border $\langle \mathcal{L}, \mathcal{R} \rangle$, where $\mathcal{L}$ is
the collection of the most general sets in $\mathcal{C}$ and $\mathcal{R}$ is the collection of the most
specific sets in $\mathcal{C}$.

In summary, it can be seen that

— Given a border $\langle \mathcal{L}, \mathcal{R} \rangle$, its corresponding collection $[\mathcal{L}, \mathcal{R}]$ is a convex
space.

— Given a convex space, it can be represented by a unique border.



Example 6. The JEP space in Example 4 can be represented by the border

<{{r}, {m}}, {{r, m, high, f}, {r, m, n, f}}>.

Its left bound is {{r}, {m}} and its right bound is {{r, m, high, f}, {r, m, n, f}}.
This border represents all the sets which are supersets of {r} or {m} and subsets
of {r, m, high, f} or {r, m, n, f}. This border is a concise representation because
it uses only four sets to represent 18 sets.
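
The set interval of a border (Definition 11) can be expanded, or queried without expansion, with a few lines of code. The following Python sketch is an illustration under our own naming (`interval`, `in_interval`); it is applied to the border of Example 6, using the same abbreviated item names.

```python
from itertools import combinations

def interval(left, right):
    """All sets that contain some element of `left` and lie inside some element of `right`."""
    members = set()
    for z in right:
        items = sorted(z)
        for size in range(len(items) + 1):
            for sub in combinations(items, size):
                y = frozenset(sub)
                if any(x <= y for x in left):
                    members.add(y)
    return members

def in_interval(y, left, right):
    """Membership test that only inspects the two bounds."""
    return any(x <= y for x in left) and any(y <= z for z in right)

left = [frozenset({"r"}), frozenset({"m"})]
right = [frozenset({"r", "m", "high", "f"}), frozenset({"r", "m", "n", "f"})]

print(len(interval(left, right)))                      # number of sets the border represents
print(in_interval(frozenset({"r", "n"}), left, right))
print(in_interval(frozenset({"high"}), left, right))
```

The membership test touches only the four boundary sets, which is the practical benefit of the border representation: the interval never has to be materialized.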

We next describe two further types of convex spaces in addition to JEP
spaces which are frequently used in data mining tasks. The two types are large
spaces and small spaces. Given a data set $\mathcal{D}$ and a support threshold $\delta$, the
collection of all the large itemsets is called a large space; the collection of all the
itemsets whose supports are smaller than the threshold $\delta$ is called a small space.
According to Proposition 1, there exists a unique border for any large space or
any small space. The left bound of any large space is $\{\emptyset\}$, and the right bound
is the set of the maximal large itemsets. The Max-Miner algorithm [3] can be
used to discover the maximal large itemsets with respect to a support threshold
in a data set.

In the following subsection, we describe horizontal spaces, a special type 
of large space. Horizontal spaces are useful for us in rewriting and computing 
JEP spaces. 

6.3 Using Horizontal Spaces to Rewrite and Compute JEP Spaces 

Given a data set $\mathcal{D}$, all non-zero support itemsets $X$, namely those with $\mathit{supp}_{\mathcal{D}}(X) \neq 0$,
form a convex space. This is mainly due to the fact that any subset of a non-zero
support itemset has a non-zero support. This convex space is specifically called
a horizontal space. Horizontal spaces can be used to exclude those itemsets
$Y$ which do not occur in $\mathcal{D}$, namely $\mathit{supp}_{\mathcal{D}}(Y) = 0$. As each horizontal space is
convex, it can be represented by a border. This border is specifically called a
horizontal border. The left bound of this border is $\{\emptyset\}$ and the right bound
$\mathcal{R}$ is the set of the most specific non-zero support itemsets. The right bound $\mathcal{R}$
can be viewed as a horizontal line which separates all non-zero support itemsets
from those zero support itemsets.

In our framework, the most specific non-zero support itemsets are those instances
in a data set $\mathcal{D}$, assuming there are no duplicate instances in this data
set. This is due to the fact that all instances have the same cardinality and
there are no duplicates in the data, and thus the data set itself is an anti-chain
collection.

We can use horizontal spaces for computing JEP spaces. 

Proposition 2. Given a set $\mathcal{D}_p$ of positive instances and a set $\mathcal{D}_n$ of negative
instances, the JEP space with respect to $\mathcal{D}_p$ and $\mathcal{D}_n$ is

$$[\{\emptyset\}, \mathcal{R}_p] - [\{\emptyset\}, \mathcal{R}_n],$$

where $[\{\emptyset\}, \mathcal{R}_p]$ is the horizontal space of $\mathcal{D}_p$ and $[\{\emptyset\}, \mathcal{R}_n]$ is the horizontal
space of $\mathcal{D}_n$.

Proof. By definition, all elements of a JEP space must occur in the positive
data set but not in the negative data set. So, subtracting all non-zero support
itemsets in $\mathcal{D}_n$ from all non-zero support itemsets in $\mathcal{D}_p$ produces all the JEPs.

Therefore, the JEP space with respect to $\mathcal{D}_p$ and $\mathcal{D}_n$ can be “represented”
by the two horizontal borders of the two data sets. The border of this JEP space
can be efficiently derived by using border-based algorithms (Dong & Li, 1999).
This idea also constructs a foundation for maintaining JEP spaces efficiently
[12].
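
Proposition 2 can be illustrated by materializing the two horizontal spaces and taking their set difference. The Python sketch below does exactly that on a hypothetical toy data set; it is a brute-force illustration of the proposition, exponential in the instance length, and is not the border-based derivation referred to above. The function names are our own.

```python
from itertools import combinations

def horizontal_space(dataset):
    """All itemsets with non-zero support: every subset of every instance."""
    space = set()
    for instance in dataset:
        items = sorted(instance)
        for size in range(len(items) + 1):
            for sub in combinations(items, size):
                space.add(frozenset(sub))
    return space

def jep_space(pos, neg):
    """Itemsets that occur in `pos` but not in `neg` (Proposition 2)."""
    return horizontal_space(pos) - horizontal_space(neg)

# Hypothetical toy data, chosen only for illustration.
pos = [frozenset("abc"), frozenset("bd")]
neg = [frozenset("ab"), frozenset("cd")]

for itemset in sorted(jep_space(pos, neg), key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset))
```

For these toy data the difference contains {a, c}, {b, c}, {b, d} and {a, b, c}; each occurs in some positive instance and in no negative one.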

6.4 Border-based Algorithms for Efficiently Discovering JEPs 

We need three algorithms, Horizon-Miner, Border-Diff, and jepProducer,
to efficiently discover the border of the JEP space with respect to a set $\mathcal{D}_p$
of positive instances and a set $\mathcal{D}_n$ of negative instances. The first algorithm,
Horizon-Miner, is used to derive the horizontal border of $\mathcal{D}_p$ or of $\mathcal{D}_n$. With
these two discovered horizontal borders as arguments, jepProducer outputs
a border as a concise representation of the JEP space. Border-Diff is a core
subroutine in jepProducer.

The details of these algorithms are omitted here. Readers are referred to [6],
[11], and [12] for further details.

7 Instance-based Classification by EPs 

An important characteristic of emerging patterns is their strong usefulness. In
this section, we describe a new instance-based classifier, called DeEPs (short
for Decision making by Emerging Patterns), to show the usefulness of emerging
patterns in solving classification problems.

The DeEPs classifier has considerable advantages with regard to accuracy,
overall speed, and dimensional scalability over other EP-based classifiers such
as CAEP [7] and the JEP-Classifier [11], because of its efficient new methods
of selecting sharp and relevant EPs, its new ways of aggregating the discriminating
power of individual EPs, and most importantly its use of an instance-based
approach, which creates a remarkable reduction in both the volume and the
dimension of the training data.



7.1 Overview of DeEPs 

Given two classes of data $\mathcal{D}_1$ and $\mathcal{D}_2$ and a testing instance $T$, the central
idea underpinning DeEPs is to discover those subsets of $T$ which are emerging
patterns between $\mathcal{D}_1$ and $\mathcal{D}_2$, and then use the supports of the discovered EPs
for prediction. We use the following example to illustrate DeEPs.

Example 7. Table 3 [16] contains a training set, for predicting whether the
weather is good for some “Saturday morning” activity. The instances, each described
by four attributes, are divided into two classes: class P and class N.



Table 3. Weather conditions and Saturday Morning activity.

Class P (suitable for activity)                 Class N (not suitable)
outlook    temperature   humidity   windy       outlook   temperature   humidity   windy
overcast   hot           high       false       sunny     hot           high       false
rain       mild          high       false       sunny     hot           high       true
rain       cool          normal     false       rain      cool          normal     true
overcast   cool          normal     true        sunny     mild          high       false
sunny      cool          normal     false       rain      mild          high       true
rain       mild          normal     false
sunny      mild          normal     true
overcast   mild          high       true
overcast   hot           normal     false


Now, given the testing instance T={sunny, mild, high, true}, which class 
label should it take? Initially, DeEPs calculates the supports (in both classes) 
of the proper subsets of T in its first step. The proper subsets of T and their 
supports are organized as the following three groups: 

1. those that only occur in Class N but not in Class P:

   Subset of T             supp_P    supp_N
   {sunny, high}           0         60%
   {sunny, mild, high}     0         20%
   {sunny, high, true}     0         20%

2. those that only occur in Class P but not in Class N:

   Subset of T             supp_P    supp_N
   {sunny, mild, true}     11%       0

3. those that occur in both classes:

   Subset of T             supp_P    supp_N
   ∅                       100%      100%
   {mild}                  44%       40%
   {sunny}                 22%       60%
   {high}                  33%       80%
   {true}                  33%       60%
   {sunny, mild}           11%       20%
   {mild, high}            22%       40%
   {sunny, true}           11%       20%
   {high, true}            11%       40%
   {mild, true}            22%       20%
   {mild, high, true}      11%       20%


Obviously, the first group of subsets, which are indeed EPs of Class N as
they do not appear in Class P at all, favours the prediction that T should be
classified as Class N. However, the second group of subsets gives us a contrasting
indication that T should be classified as Class P, although this indication is not
as strong as that of the first group. The third group also strongly suggests that
we should favour Class N as T’s label, although the pattern {mild} contradicts
this mildly. Using these EPs in a collective way, not separately, DeEPs would
decide that T’s label is Class N since the “aggregation” of EPs occurring in
Class N is much stronger than that in Class P.
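
The first step of Example 7, computing the supports of the proper subsets of T in both classes and sorting them into the three groups, can be reproduced with the short Python sketch below. It covers only this first step by naive enumeration; how DeEPs aggregates the discovered EPs into a final decision is described in [10] and is not modelled here. The function and variable names are our own, and each instance of Table 3 is written as a set of attribute values.

```python
from itertools import combinations

# Table 3, with each instance written as a set of attribute values.
CLASS_P = [frozenset(t) for t in [
    ("overcast", "hot", "high", "false"),
    ("rain", "mild", "high", "false"),
    ("rain", "cool", "normal", "false"),
    ("overcast", "cool", "normal", "true"),
    ("sunny", "cool", "normal", "false"),
    ("rain", "mild", "normal", "false"),
    ("sunny", "mild", "normal", "true"),
    ("overcast", "mild", "high", "true"),
    ("overcast", "hot", "normal", "false"),
]]
CLASS_N = [frozenset(t) for t in [
    ("sunny", "hot", "high", "false"),
    ("sunny", "hot", "high", "true"),
    ("rain", "cool", "normal", "true"),
    ("sunny", "mild", "high", "false"),
    ("rain", "mild", "high", "true"),
]]

def supp(itemset, dataset):
    """Support of `itemset` in `dataset`, as a fraction."""
    return sum(1 for t in dataset if itemset <= t) / len(dataset)

def group_proper_subsets(test, pos, neg):
    """Split the proper subsets of `test` by the class(es) in which they occur."""
    groups = {"only_n": [], "only_p": [], "both": []}
    items = sorted(test)
    for size in range(len(items)):  # sizes 0 .. |T|-1, i.e. proper subsets only
        for combo in combinations(items, size):
            s = frozenset(combo)
            sp, sn = supp(s, pos), supp(s, neg)
            if sp == 0 and sn > 0:
                groups["only_n"].append((s, sp, sn))
            elif sp > 0 and sn == 0:
                groups["only_p"].append((s, sp, sn))
            elif sp > 0 and sn > 0:
                groups["both"].append((s, sp, sn))
    return groups

T = frozenset({"sunny", "mild", "high", "true"})
groups = group_proper_subsets(T, CLASS_P, CLASS_N)
for name, members in groups.items():
    print(name, len(members))
```

The group sizes agree with the three lists in Example 7: three subsets occur only in Class N, one only in Class P, and eleven in both.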

In practice, an instance may contain many (e.g., 100 or more) attributes. To
examine all subsets and discover the relevant EPs contained in such instances by
naive enumeration is too expensive (e.g., checking $2^{100}$ or more sets). We make
DeEPs efficient and scalable for high dimensional data by the following data
reduction and concise data-representation techniques.

— We reduce the training data sets firstly by removing those items that do
not occur in the testing instance and then by selecting the maximal ones
from the processed training instances. This data reduction process makes
the training data sparser in both horizontal and vertical directions.

— We use borders, two-bound structures like $\langle \mathcal{L}, \mathcal{R} \rangle$, to succinctly represent
all EPs contained in a testing instance.

— We select boundary EPs (typically small in number, e.g., 81 in mushroom)
for DeEPs’ decision making.

Detailed discussions and illustrations of these points are given in [10]. Table 4
illustrates the first stage of the sparsifying effect on both the volume and
dimension of $\mathcal{D}_P$ and $\mathcal{D}_N$, by removing all items that do not occur in $T$. Observe
that the transformed $\mathcal{D}_P$ and $\mathcal{D}_N$ are sparse, whereas the original $\mathcal{D}_P$ and $\mathcal{D}_N$
are dense since there is a value for each attribute of each instance.
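
The data reduction step in the first item above (intersect every training instance with the testing instance, then keep only the maximal results) can be sketched in a few lines of Python. This is an illustrative reading of that step for discrete attributes only, with hypothetical function names; it uses the training data of Table 3 and the testing instance of Example 7.

```python
# Table 3, with each instance written as a set of attribute values.
CLASS_P = [frozenset(t) for t in [
    ("overcast", "hot", "high", "false"),
    ("rain", "mild", "high", "false"),
    ("rain", "cool", "normal", "false"),
    ("overcast", "cool", "normal", "true"),
    ("sunny", "cool", "normal", "false"),
    ("rain", "mild", "normal", "false"),
    ("sunny", "mild", "normal", "true"),
    ("overcast", "mild", "high", "true"),
    ("overcast", "hot", "normal", "false"),
]]
CLASS_N = [frozenset(t) for t in [
    ("sunny", "hot", "high", "false"),
    ("sunny", "hot", "high", "true"),
    ("rain", "cool", "normal", "true"),
    ("sunny", "mild", "high", "false"),
    ("rain", "mild", "high", "true"),
]]

def reduce_training(test, dataset):
    """Project each instance onto the items of `test`, then keep the maximal itemsets."""
    projected = {instance & test for instance in dataset}
    return {x for x in projected if not any(x < y for y in projected)}

T = frozenset({"sunny", "mild", "high", "true"})
reduced_p = reduce_training(T, CLASS_P)
reduced_n = reduce_training(T, CLASS_N)

print(sorted(sorted(s) for s in reduced_p))
print(sorted(sorted(s) for s in reduced_n))
```

Nine positive instances shrink to two maximal itemsets and five negative instances to three, which is the reduction in volume and dimension that the text describes.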

For continuous attributes, we describe a new method, called neighborhood-based
intersection. This allows DeEPs to determine which continuous attribute
values are relevant to a given testing instance, without the need to
pre-discretize data. More details can be found in [6].

Table 4. Reduced training data after removing items irrelevant to the instance
{sunny, mild, high, true}. A "*" indicates that an item is discarded. There are
only two maximal itemsets in the Reduced Class P (namely {sunny, mild, true}
and {mild, high, true}) and only 3 maximal itemsets in the Reduced Class N.

Reduced Class P                                 Reduced Class N
outlook    temperature   humidity   windy       outlook   temperature   humidity   windy
*          *             high       *           sunny     *             high       *
*          mild          high       *           sunny     *             high       true
*          *             *          *           *         *             *          true
*          *             *          true        sunny     mild          high       *
sunny      *             *          *           *         mild          high       true
*          mild          *          *
sunny      mild          *          true
*          mild          high       true
*          *             *          *


7.2 Performance Evaluation: Accuracy, Speed, and Scalability 

We now present experimental results, mainly to demonstrate the accuracy of
DeEPs. We have run DeEPs on 40 data sets taken from the UCI
Machine Learning Repository [4]. The accuracy results were obtained using the
stratified ten-fold cross-validation methodology (CV-10), where each fold has the
same class distribution as the original data. These experiments were carried out
on a 500 MHz Pentium III PC with 512 MB of RAM.

We first compare DeEPs with k-nearest neighbor (k-NN) [5], as it is also an
instance-based classifier and has received extensive research since its conception
in 1951. Following tradition, we set k to 3. We also compare DeEPs with C5.0
[Release 1.12], a commercial version of C4.5 [17]. We report our experimental
results in Table 5. The testing accuracies of DeEPs, k-nearest neighbor, and
C5.0 are achieved under the condition that they use the same data for their
training and the same data for their testing. We list the names of the data sets
in Column 1 of Table 5, and their properties in Column 2. Columns 3 and 4
show the CV-10 average accuracy of DeEPs when the neighborhood factor α
is fixed at 12 for all data sets and when α is dynamically selected
within each data set, respectively (see [10] for how to select α). We list the
accuracies of k-nearest neighbor and C5.0 in Column 5 and Column 6, respectively.
Column 7 and Column 8 show the average time used by DeEPs and k-nearest
neighbor, respectively, to test one instance.



Firstly, for the mushroom data set, DeEPs, k-NN, and C5.0 can all achieve
100% testing accuracy. For the remaining data sets, we highlight some interesting
points.

1. DeEPs versus k-NN.

— DeEPs and k-NN are equally accurate on soybean-small
(100%) and on iris (96%).

— DeEPs wins on 26 data sets; k-NN wins on 11 data sets. We conclude
that the accuracy of DeEPs is generally better than that of k-NN.

— DeEPs is about 1.5 times slower than k-NN. The main
reason is that DeEPs needs to conduct border operations.

2. DeEPs versus C5.0.

— DeEPs wins on 25 data sets; C5.0 wins on 14 data sets. Therefore, DeEPs
is generally much more accurate than C5.0.

— DeEPs is much slower than C5.0. However, DeEPs takes an instance-based
learning strategy.

3. DeEPs, k-NN, and C5.0.

— DeEPs wins on 20 data sets; k-NN wins on 7 data sets; C5.0 wins on 14
data sets.

An important conclusion we can reach here is that DeEPs is an accurate
instance-based classifier. Its accuracy is generally better than that of
state-of-the-art classifiers such as C4.5 and k-nearest neighbor. However, the speed of
DeEPs requires great improvement to compete well with other classifiers. This
problem constitutes our future research topic.

The primary metric for evaluating classifier performance is classification ac- 
curacy. We have already shown DeEPs is an accurate classifier and it is generally 
better than the other classifiers. Our experimental results have also shown that 
the decision speed of DeEPs is fast, and it has a noticeable scalability over the 
number of training instances. 



8 Related Works 

In previous works, we proposed two eager learning classifiers by making use of 
the discriminating power of emerging patterns. One is called CAEP [7], and the 
other the JEP-Classifier [11]. Both of these have a learning phase which is used 
to generate a collection of emerging patterns, and a testing phase. Usually the 
size of the collection of the discovered EPs is large, and a large proportion of 
these EPs are never used to test any instances. The learning phase usually takes 
a significant amount of time. On the other hand, the testing phase is very fast. 

How to maintain a system in response to minor data changes is an impor- 
tant problem. The problem of incrementally maintaining JEP spaces has been 
well solved in [12]. The proposed algorithms can handle a wide range of data 
changes including insertion of new instances, deletion of instances, insertion of 
new attributes, and deletion of attributes. 

Top rules are the association rules that have 100% confidence. In [13], 
we proposed a technique, fundamentally different from the traditional 
approaches, to discover all top rules. The top rules are concisely represented 
by means of JEP spaces regardless of their supports. The advantage of our 
method is that it can discover top rules with very low support. 
9 Conclusions 

In this paper, we have reviewed the concept of emerging patterns. Emerging 
patterns can capture emerging trends in business data and sharp contrasts 
between classes. Emerging patterns satisfy the properties of validity, novelty, 
potential usefulness, and understandability. We have described a special type 
of EP, jumping emerging patterns. JEP spaces have been shown to have the 
property of convexity. Based on this property, JEP spaces can be concisely 
represented by boundary elements which can be efficiently derived by our 
border-based algorithms. We have described an EP-based classifier, DeEPs. The 
DeEPs classifier takes an instance-based learning strategy. The reported 
experimental results have shown that its accuracy is better than that of 
k-nearest neighbor and C5.0. 
Table 5. Accuracy of DeEPs in comparison to those of k-nearest neighbor and 
C5.0. 

Data sets   inst, attri,    DeEPs     DeEPs          k-NN    C5.0    time (s.)  time (s.) 
            classes         α = 12    dynamical α    k = 3           DeEPs      k-NN 

australia   690, 14, 2      84.78     88.41* (5)     66.69   85.94   0.054      0.036 
breast-w    699, 10, 2      96.42     96.42 (12)     96.85   95.43   0.055      0.036 
census      30162, 16, 2    85.93     85.93* (12)    75.12   85.80   2.081      1.441 
chess       3196, 36, 2     97.81     97.81          96.75   99.45   0.472      0.145 
cleve       303, 13, 2      81.17     84.21* (15)    62.64   77.16   0.032      0.019 
diabete     768, 8, 2       76.82     76.82* (12)    69.14   73.03   0.051      0.039 
flare       1066, 10, 2     83.50     83.50          81.62   82.74   0.028      0.016 
german      1000, 20, 2     74.40     74.40 (12)     63.1    71.3    0.207      0.061 
heart       270, 13, 2      81.11     82.22 (15)     64.07   77.06   0.025      0.013 
hepatitis   155, 19, 2      81.18     82.52 (11)     70.29   74.70   0.018      0.011 
letter-r    20000, 16, 26   93.60     93.60* (12)    95.58   88.06   3.267      1.730 
lymph       148, 18, 4      75.42     75.42 (10)     74.79   74.86   0.019      0.010 
pima        768, 8, 2       76.82     77.08* (14)    69.14   73.03   0.051      0.038 
satimage    6435, 36, 6     88.47     88.47* (12)    91.11   86.74   2.821      1.259 
segment     2310, 19, 7     94.98     95.97* (5)     95.58   97.28   0.382      0.365 
shuttle-s   5800, 9, 7      97.02     99.62* (1)     99.54   99.65   0.438      0.295 
splice      3175, 60, 3     69.71     69.71          70.03   94.20   0.893      0.248 
vehicle     846, 18, 4      70.95     74.56* (15)    65.25   73.68   0.134      0.089 
voting      433, 16, 2      95.17     95.17          92.42   97.00   0.025      0.012 
waveform    5000, 21, 3     84.36     84.36* (12)    80.86   76.5    2.522      0.654 
yeast       1484, 8, 10     59.78     60.24* (10)    54.39   56.14   0.096      0.075 
anneal      998, 38, 6      94.41     95.01 (6)      89.70   93.59   0.122      0.084 
auto        205, 25, 7      67.65     72.68 (3.5)    40.86   83.18   0.045      0.032 
crx         690, 15, 2      84.18     88.11* (3.5)   66.64   83.91   0.055      0.038 
glass       214, 9, 7       58.49     67.39 (10)     67.70   70.01   0.021      0.017 
horse       368, 28, 2      84.21     85.31* (3.5)   66.31   84.81   0.052      0.024 
hypo        3163, 25, 2     97.19     98.26 (5)      98.26   99.32   0.275      0.186 
ionosph     351, 34, 2      86.23     91.24 (5)      83.96   91.92   0.147      0.100 
iris        150, 4, 3       96.00     96.67* (10)    96.00   94.00   0.007      0.006 
labor       57, 16, 2       87.67     87.67* (10)    93.00   83.99   0.009      0.008 
mushroom    8124, 22, 2     100.0     100.0          100.0   100.0   0.436      0.257 
nursery     12960, 8, 5     99.04     99.04          98.37   97.06   0.290      0.212 
pendigits   10992, 16, 10   98.21     98.44 (18)     99.35   96.67   1.912      0.981 
sick        4744, 29, 2     94.03     96.63 (5)      93.00   98.78   0.284      0.189 
sonar       208, 60, 2      84.16     86.97* (11)    82.69   70.20   0.193      0.114 
soybean-s   47, 34, 4       100.0     100.0* (10)    100.0   98.00   0.022      0.017 
soybean-l   683, 35, 19     90.08     90.08          91.52   92.96   0.072      0.051 
t-t-t       958, 9, 2       99.06     99.06          98.65   86.01   0.032      0.013 
wine        178, 13, 3      95.58     96.08* (11)    72.94   93.35   0.028      0.019 
zoo         101, 16, 7      97.19     97.19          93.93   91.26   0.007      0.005 

Abstract. The twenty-five-year-old Internet Protocol (IP), known as IPv4, 
has a long history of connecting the world for information exchange. Its 
successor, IPv6, with promising functionality and capability, is being designed 
to replace it. This paper analyses the performance of the new IP compared to 
the old IP on the dual stack implementation of KAME [7] FreeBSD [9] using the 
ping and FTP applications. Packet transmission time has been taken as the 
measurement metric. The experimental results show that IPv6 performance is 
inferior to that of IPv4 and does not conform to the theoretical expectations. 



1 Introduction 

The twenty-five-year-old Internet Protocol (IP), known as IPv4, has a long 
history of connecting the world for information exchange. However, the need for 
a larger address space, simple routing capability, scalability, security, 
easier network configuration and management, QoS (Quality of Service), 
mobility and multicasting forced the IETF to design its successor, IPv6 [1,2]. 
Even though work on the design and implementation of IPv6 started in 1992, it 
has not been widely accepted by the world community. It is expected that, at 
the current rate of IP address allocation, IPv4 will see address depletion by 
the year 2008 [3]. So the transition from IPv4 to IPv6 is inevitable, even 
though the address shortage problem can be minimized by adopting temporary 
techniques such as Network Address Translation (NAT), IP Masquerade and 
Dynamic Host Configuration Protocol (DHCP). 

However, the transition from the existing IPv4 to the new IPv6 will be a 
gradual process, as planned by the IETF IPng Working Group. This step is 
necessary to guarantee that the transition process does not interrupt Internet 
activities as a whole. It also gives organizations financial and time 
flexibility to upgrade and replace the relevant applications and equipment. 
The existence of IPv4 entities in the IPv6 Internet, or of IPv6 entities in 
the IPv4 Internet, will be a common phenomenon for at least a few years before 
the complete replacement of IPv4 takes place. Dual stack implementations, 
which support both versions of IP, will be used together with native IPv4 and 
native IPv6 nodes in the Internet during this period. 

IPv6 has generally been accepted as the next generation protocol, which solves 
many of the constraints and problems of today's networks while improving 
performance [4]. It is important to examine the performance of the promising 
IPv6 both as a native host and as a dual stack host. The objective of this 
paper is to analyze and study the performance of IPv6 using the ping and FTP 
applications on a FreeBSD implementation. 



2 IPv6 Features 



Version | Traffic Class | Flow Label 
Payload Length | Next Header | Hop Limit 
Source Address 
Destination Address 

Fig. 1. IPv6 Header 

The new design of the IPv6 header format [2], with fewer fields compared to 
IPv4, is aimed at increasing the speed at which packets travel through the 
router. Fields which are not examined by the routers along the path, but are 
required by the sending and receiving nodes, are placed between the IPv6 header 
and the transport layer header. Besides simplifying the header, this reduces 
the router's computational workload and speeds up packet delivery. Additional 
optional headers are also easier to add, making IPv6 more flexible than IPv4. 
Since the IPv6 header has a fixed length, processing is also simplified. 
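
The fixed layout of Fig. 1 can be illustrated with a short sketch. The Python fragment below packs the eight header fields into bytes. The helper name pack_ipv6_header, the sample addresses (taken from the 2001:db8::/32 documentation range) and the field values are illustrative assumptions; this is not the KAME code.

```python
import ipaddress
import struct

def pack_ipv6_header(payload_len, next_header, hop_limit, src, dst,
                     traffic_class=0, flow_label=0):
    """Pack the fixed IPv6 header of Fig. 1 (hypothetical helper)."""
    # First 32-bit word: Version (4 bits) | Traffic Class (8) | Flow Label (20)
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    fixed = struct.pack("!IHBB", first_word, payload_len, next_header, hop_limit)
    # Source and destination addresses are 128 bits (16 bytes) each
    return (fixed
            + ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed)

header = pack_ipv6_header(payload_len=1460, next_header=6, hop_limit=64,
                          src="2001:db8::1", dst="2001:db8::2")
print(len(header))      # length of the fixed header in bytes
print(header[0] >> 4)   # version nibble
```

Because no field has a variable length, a router can locate every field at a constant offset, which is the simplification the paragraph above refers to.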

Unlike IPv4, IPv6 does not fragment packets as they are routed. The Maximum 
Transmission Unit (MTU) of the transmission path is discovered before packets 
are transported over different network layers. This means that packet 
fragmentation and reassembly are done exclusively in the communicating hosts, 
reducing the workload of the routers. It is expected that, once IPv6 
deployment over the Internet is complete, the Internet will have networks with 
an MTU not smaller than 576 bytes. 

The present IP carries a checksum within each IP packet, which must therefore 
be recomputed at each routing step. This computation is done for every packet, 
and the total expenditure of computing resources on checksumming in the 
Internet, which carries trillions of packets, is significant. In IPv6 the 
checksum operation has been eliminated from the IP layer and is left to other 
layers to ensure accurate packet delivery. 
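
To make the per-hop cost concrete, the following sketch computes the standard ones' complement IPv4 header checksum and repeats it after a TTL decrement, as a forwarding router would. The sample header bytes are illustrative values and the function name is an assumption.

```python
def ipv4_header_checksum(header: bytes) -> int:
    """Ones' complement checksum over an IPv4 header whose checksum field is zero."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]   # sum the 16-bit words
    while total >> 16:                              # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Sample 20-byte header (illustrative values) with the checksum field zeroed
sample = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = ipv4_header_checksum(sample)
print(hex(checksum))

# A router that decrements the TTL (byte 8) must recompute the checksum
forwarded = bytearray(sample)
forwarded[8] -= 1
forwarded_checksum = ipv4_header_checksum(bytes(forwarded))
print(hex(forwarded_checksum))
```

An IPv6 router still decrements the Hop Limit, but it has no header checksum to update afterwards.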

The use of flow labels in IPv6 will further optimize IPv6 performance. The 
source of a flow specifies in the Flow Label field any special service 
requirements it has of the routers along the path, such as priority, delay or 
bandwidth. All the packets of that particular flow carry the same information 
in the field, enabling a request for a service type at the intermediate 
routers and thereby minimizing the computation needed to deliver each packet. 
The multicast function in IPv6, which replaces broadcast, allows nodes to 
discover the participants of a group communication via neighbor discovery and 
Internet Control Message Protocol (ICMPv6) messages, and reduces unnecessary 
packet examination by routers. None of these features was possible in the 
IPv4 Internet. 

IPv6 address assignment is more efficient than in IPv4 and conforms to the 
hierarchical structure of the Internet. The concept of CIDR (Classless Inter 
Domain Routing) has been adopted in formulating the addressing structure, 
which allows address aggregation with the provider-based address format. 
Addresses have two main portions, the routing and interface identifiers. The 
routing identifiers are further divided into subfields denoting registries, 
providers, subscribers and subnets. Route aggregation will limit the routing 
table explosion problem and make routing even simpler. It is an exciting 
exercise to evaluate and analyze the performance of this multi-feature 
Internet Protocol against the above-mentioned criteria. 
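
Route aggregation can be sketched with Python's standard ipaddress module. In the example below, four contiguous customer prefixes are advertised as a single provider prefix; the prefixes come from the 2001:db8::/32 documentation range and the /34 split is an assumption made only for illustration.

```python
import ipaddress

# Four contiguous /34 customer prefixes allocated out of one provider block
customers = [
    ipaddress.ip_network("2001:db8::/34"),
    ipaddress.ip_network("2001:db8:4000::/34"),
    ipaddress.ip_network("2001:db8:8000::/34"),
    ipaddress.ip_network("2001:db8:c000::/34"),
]

# collapse_addresses merges adjacent networks into the shortest covering list
aggregated = list(ipaddress.collapse_addresses(customers))
print(aggregated)
```

A router outside the provider then needs one routing table entry instead of four, which is the effect the paragraph above attributes to the provider-based address format.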

3 Challenges 

Even though IPv6 promises many advanced features, in the transition period it 
will face many challenges that will affect its overall performance. Dual 
protocol stack configurations will be common at all exchange points, at least 
for a few years. Before switching completely from native IPv4 to IPv6, 
clients, servers and routers will be running both the IPv4 and IPv6 protocols. 
This dual stack mechanism will compromise IPv6 performance. Figure 2 depicts 
the general protocol layers for dual IP. 




Legend: IPv4 packet flow; IPv4/6 over IPv6/4 packet; IPv6 packet flow 

Fig. 2. General Protocol Layers for Dual IP Stacks. 
Besides that, the IETF plans to adopt tunneling as another transition technique 
for migrating to IPv6. This technique allows native IPv6 nodes or islands to 
communicate in an IPv4 ocean, where all the interconnecting nodes are running 
IPv4. The reverse is also possible, with IPv4 packets tunneled through an IPv6 
infrastructure. A communicating node's packet is encapsulated as the payload of 
a packet of the carrying IP. At the other end of the communication, the 
receiver decapsulates the packet before processing the actual packet. Figure 3 
shows the encapsulation and decapsulation process for both IPs. Most of the 
inter-network connections in the IPv6 test-bed, 6bone, are made via the 
tunneling technique. Tunneling causes overhead, since the additional header of 
the carrying IP reduces the actual payload of the packet. 
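
The size of this overhead can be estimated with simple arithmetic. The sketch below assumes a 1500-byte Ethernet payload, minimal IPv4 (20-byte), IPv6 (40-byte) and TCP (20-byte) headers without options, and a single level of encapsulation; the function and variable names are illustrative.

```python
ETHERNET_PAYLOAD = 1500   # bytes available to IP on Ethernet
IPV4_HEADER = 20          # minimal header, no options (assumption)
IPV6_HEADER = 40          # fixed header, no extension headers (assumption)
TCP_HEADER = 20

def tcp_payload(*ip_headers):
    """TCP payload left after the given stack of IP headers (outermost first)."""
    return ETHERNET_PAYLOAD - sum(ip_headers) - TCP_HEADER

native_v4 = tcp_payload(IPV4_HEADER)
native_v6 = tcp_payload(IPV6_HEADER)
v6_in_v4 = tcp_payload(IPV4_HEADER, IPV6_HEADER)   # IPv6 packet tunneled over IPv4
v4_in_v6 = tcp_payload(IPV6_HEADER, IPV4_HEADER)   # IPv4 packet tunneled over IPv6

for name, size in [("native IPv4", native_v4), ("native IPv6", native_v6),
                   ("IPv6 over IPv4", v6_in_v4), ("IPv4 over IPv6", v4_in_v6)]:
    print(f"{name}: {size} bytes")
```

Under these assumptions either tunnel direction carries both IP headers, so the payload per packet is the same in both cases and smaller than for either native protocol.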







Source: Original IPv6/IPv4 Packet -> IPv4/IPv6 Encapsulated IPv6/IPv4 Packet 
(IPv4 or IPv6 Header added) 
Destination: IPv4/IPv6 Decapsulated IPv6/IPv4 Packet 

Fig. 3. Encapsulation and decapsulation of IPv4 and IPv6 packets. 



Besides the above-mentioned factors, interoperability problems between 
equipment from multiple vendors will cause additional performance problems and 
delay the process of IPv6 deployment over the Internet. Independent bodies 
such as TAHI [12] carry out conformance tests to ensure that IPv6 nodes are 
compatible. 

Translation, which covers Network Address Translation and Port Translation, 
Protocol Translation and Application Proxies, is another method that has been 
adopted by the IETF to allow inter-protocol communication. IP translation [5] 
for IPv4/IPv6 and ICMPv4/ICMPv6 has been carried out and the result is 
encouraging. In this technique IP header fields are either directly copied, 
translated or eliminated to suit the version of the IP. However, information 
loss is unavoidable, and an increase in header length is also experienced 
during the header translation [5]. 

In this paper we discuss the performance of IPv6 over the Ethernet network 
layer for the KAME [7] FreeBSD [9] dual stack implementation, as described in 
the following section. Previous work [6] on this issue did not reach concrete 
results because the implementation of the KAME FreeBSD dual stack was still 
changing. In this paper a stable version of the IPv6 stack implementation has 
been adopted for performance analysis. 



4 Experiment 

We adopt the dual IP stack of FreeBSD with the KAME IPv6 patch [8], a 
well-known, stable and up-to-date IPv6 implementation. Tests were conducted 
using the KAME FreeBSD FTP and Ping applications, which run over the reliable 
TCP layer and over ICMP messages, respectively. The FTP measurement was done 
for the data connection only [6]. Two network configurations, namely On-Link 
and Off-Link, have been set up to conduct the experiments. The off-link 
configuration allows data transfer between workstations without the presence 
of a router, while the on-link tests have routers in between the workstations. 

The on-link test is significant for testing the impact, at routers, of the 
removal of the checksum operation in IPv6 compared to IPv4. The effect of the 
simplified header fields on the data transfer can also be analyzed by the 
on-link tests. These are the two specific features that we would like to 
investigate in the experiment configuration. As shown in Figure 2, the FTP and 
Ping file/message transfers take place between the two test workstations. The 
packets are transferred according to the configuration of layers shown in 
Figure 2. The only difference between the TCP or UDP implementation in KAME 
[8] for IPv6 and for IPv4 is the address length, which is 128 bits compared to 
32 bits in IPv4. We assume that the implementation of the rest of the stack is 
identical between IPv4 and IPv6 for KAME in FreeBSD. The only major difference 
between them is the IPv6 [2] implementation. 

Ping and FTP tests were carried out in our experiment for both the on-link and 
off-link configurations, for IPv4 and IPv6. The TAHI group, upon our request, 
conducted the off-link FTP test¹ and we conducted the other two tests. Ping 
test for off-link was not conducted. The ping and FTP (including server and 
client) [6] applications for IPv4 used in our experiment were from FreeBSD, 
and those for IPv6 were from the KAME patch for FreeBSD. 



Table 1. List of tests in the experiment 

IP      Ping (On link)        FTP (On link and Off link) 
IPv4    FreeBSD               FreeBSD 
IPv6    FreeBSD/KAME patch    FreeBSD/KAME patch 

4.1 Off-Link Configuration 




Workstation A Workstation B 



Fig. 4. Off-link FreeBSD with KAME patch dual stack network configuration 



Figure 4 shows two workstations interconnected directly via an Ethernet LAN 
and attached to the same segment. The link speed is 100 Mbps and both 
workstations run on a Pentium II processor at 350 MHz with 64 MB of memory. 
The Ping application for both versions of IP has been tested on the above 
setup. As for the FTP test, the TAHI group was contacted for confirmation of 
the performance problem and they carried out the experiment [12]¹. 



4.2 On-Link Configuration 



Router A Router B 




Workstation X Workstation Y 



Fig. 5. On-link FreeBSD with KAME patch dual stack network configuration for 
workstations and routers 



Figure 5 shows the two workstations interconnected via two FreeBSD routers 
with the KAME patch. All the routers and workstations in this setup run 
FreeBSD with the KAME patch for IPv6. The workstation configuration is given 
in Appendix A. 



¹ The FTP experiment was conducted by the TAHI group upon the author's request, 
since they perform conformance and interoperability tests for all IPv6 products 
and implementations. The experiment results are viewable at 
http://www.tahi.org. 
5 Results and Analysis 

The present KAME IPv6 dual stack implementation modifies the TCP layer, where 
the address size has been changed from 32 bits to 128 bits, and disables the 
checksum calculation at the IP layer. Besides that, the IP header is simplified 
and no further fragmentation of packets is done in IPv6, since the Ethernet 
layer has an MTU of 1500 bytes. We would like to investigate the impact of 
these IPv6 characteristics on the application layer. 

Table 2. Ethernet packet size breakdown 

Ethernet MTU (including header)    1514 bytes 
Ethernet header                    14 bytes 
Ethernet payload                   1500 bytes 

                  IPv4 (bytes)    IPv6 (bytes) 
IP Header         20              40 
IP Payload        1480            1460 
TCP Header        20              20 
TCP Payload       1460            1440 
Options           xxxx            xxxx 
Actual Payload    yyyy            yyyy 




Table 2 shows the composition of the packet that is carried across the network 
on the Ethernet medium. The actual TCP payload is 1460 bytes for IPv4 and 1440 
bytes for IPv6. With this reduction of the TCP payload in IPv6 by 20 bytes, or 
1.37 %, the performance is expected to degrade by 1.37 % as well. Richard 
Draves et al. [10] reported a 2 % performance degradation in their 
implementation of IPv6 for the NT system, which is tolerable. However, another 
performance evaluation [11], for MBONE multicast tools on IPv6, shows that the 
performance reduction is more than expected. 
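
The expected figure follows directly from the payload sizes in Table 2. The sketch below reproduces the calculation and, as an additional illustration, counts the full-size segments needed for a file; the 1,000,000-byte file size is a hypothetical value, not one of the measured files.

```python
import math

TCP_PAYLOAD_IPV4 = 1460   # bytes per full-size segment (Table 2)
TCP_PAYLOAD_IPV6 = 1440

reduction = (TCP_PAYLOAD_IPV4 - TCP_PAYLOAD_IPV6) / TCP_PAYLOAD_IPV4
print(f"payload reduction: {reduction:.2%}")

def segments_needed(file_size, payload):
    """Number of full-size TCP segments needed to carry file_size bytes."""
    return math.ceil(file_size / payload)

file_size = 1_000_000     # hypothetical file size in bytes
segments_v4 = segments_needed(file_size, TCP_PAYLOAD_IPV4)
segments_v6 = segments_needed(file_size, TCP_PAYLOAD_IPV6)
print(segments_v4, segments_v6)
```

This accounts only for header overhead; any degradation measured beyond this figure has to come from other sources, such as protocol processing in the hosts.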

5.1 Off Link Test 

The test shows an average 4-5% deterioration in overall performance for the 
ping test between two off-link workstations. Figure 6 shows the measured result. 



Off Link Ping Test (horizontal axis: Packet Size (Bytes)) 

Fig. 6. Off-link Ping test result 
The off-link FTP test [12] was conducted by the TAHI group. They set up four 
different environments for the test, namely native IPv4, only IPv4 on dual 
stack, IPv4 on dual stack with IPv6, and IPv6 on dual stack with IPv4 [12]. 
The transmission time is shown in Figure 7². The FTP tests show a similar 
performance deterioration for IPv6 compared to IPv4 as the file size gets 
larger. 




Fig. 7. Transmission time for off-link FTP test results (please refer to 
Appendix B for a better figure) 



5.2 On-link Test 

The on-link test results show IPv6 performance behaving as in the off-link 
test. The presence of the routers does not influence the time taken to 
transfer the files across the network. Transmission of IPv4 packets 
encapsulated in IPv6 shows degraded performance, as native IPv6 does, while 
IPv6 packets encapsulated in IPv4 do not show much difference compared to 
IPv4, since the transport mechanism is still IPv4. 



² The transmission diagram has been extracted from the TAHI project homepage 
with their permission. 
Data Transfer Time for Different Size of Files over Different Network Layers 
(horizontal axis: File Size (bytes)) 

Fig. 8. On-link FTP test result 



6 Conclusion and Future Work 

IPv6 shows degraded performance compared to IPv4 for the files that have been 
transferred and also for the ICMP packets in the Ping test. Even though the 
expected performance deterioration was ~1.5%, in the actual implementation it 
is more than 5% on average. Besides that, for file transfers the performance 
of IPv6 deteriorates further as the file size increases, as shown in Figure 7 
and Figure 8. 

The results show that IPv6 performance is not better than that of IPv4; 
instead it is worse than both IPv4 and the expected result. Further analysis 
of the code is necessary to clarify the real constraint on IPv6 performance, 
and fine tuning needs to be done. 

The performance analysis is an ongoing activity in our lab. There are some 
suspected reasons for the performance degradation in IPv6. In the test we 
assume that FreeBSD for IPv4 and the IPv6 KAME patch for FreeBSD are similar 
except for the IPv6 implementation. Besides that, the TCP and ICMPv6 
implementations have been assumed to work well. Currently we are conducting 
tests, with the help of the KAME group, to verify all the implementations that 
may affect the test results. Code analysis of the dual stack is necessary to 
find the reasons for the deterioration and also for further enhancement. In 
our future paper we expect to explain the performance degradation and to 
propose improvements for it. 

IPv6 promises many advanced features; therefore a proper implementation of 
this new protocol is necessary. During the transition period users should be 
convinced that this migration is towards a better IP. Until a stable and 
fine-tuned version of IP is available, users will be skeptical and will resist 
migrating to the newer version of IP. 
Abstract. To support call locality (CL) in wireless networks, we locate 
the cache at the local signaling transfer point (LSTP) level. The terminal's 
registration area (RA) crossings within the LSTP area do not generate 
home location register (HLR) traffic. The RAs in an LSTP area are 
grouped statically, which removes the signaling overhead and mitigates 
the RSTP bottleneck of dynamic grouping. The idea behind LSTP 
level caching is to decrease the inconsistency ratio (ICR) with static RA 
grouping. Our scheme solves the HLR bottleneck problem caused by the 
terminals' frequent RA crossings. 



1 Introduction 

The mobility management schemes are based on the Interim Standard-41 (IS-41) 
and Global System for Mobile Communication (GSM) standards. Those schemes 
use a two-level hierarchy composed of the home location register (HLR) and the 
visitor location register (VLR) [2,9]. Whenever a terminal crosses an RA or a 
call originates, the HLR must be updated or queried. These frequent DB accesses 
and signaling message transfers may cause an HLR bottleneck, load the wireless 
network and degrade the system performance [1]. Several schemes have been 
proposed to solve the problems of the standard scheme [5, 6, 7, 8]. 

In the Local Anchor (LA) scheme, one of the VLRs is permitted to act as a local 
anchor, and this local anchor VLR takes over the role of the HLR in location 
registration. This has the merit of reducing the registration cost, but the 
location-tracking cost increases because tracking passes through the local 
anchor. 

In the Forwarding Pointer (FP) scheme, whenever the RA changes, the terminal 
does not register with the HLR; instead a forwarding pointer is created from 
the VLR of the previous RA. Thus location registration does not access the HLR 
and the registration cost is reduced. However, the cost of a location request 
can increase, because finding the location of the terminal depends on the 
number of forwarding pointer steps. 

To support the CL, our scheme locates the cache at the local signaling transfer 
point (LSTP) level and groups the RAs whose serving mobile switching centers 
(MSCs) are connected to the LSTP, as depicted in Fig. 1. 
In wireless network environments, one of the distinct characteristics of call 
patterns is the CL. The CL is related to the regional distribution of the 
callers who request calls to a given callee. The concept is conspicuous when 
the callee temporarily moves out of his home area. The call pattern is 
important in a mobility management scheme. Using the conceptual call pattern, 
we estimate the region from which many of the calls to a given user originate. 
We define this regional scope as the degree of call locality. The degree of CL 
differs for each user, but we can limit the scope approximately to a regional 
area. The degree is said to be high when many calls to a given terminal are 
generated from a few RAs. Generally, it is reasonable to assume the degree to 
be low enough that the domain of the calling region covers the callee's 
working area and its neighboring RAs. 

By applying the CL to track a call in an efficient manner, we can greatly 
decrease the signaling traffic load. That is, the HLR query traffic caused 
whenever a call originates is reduced to some extent by applying the cache. 
The key idea behind applying the cache is to make the cached information be 
referred to frequently while maintaining its consistency to some extent. The 
consistency problem therefore arises when improving the performance. 

We present an improved caching scheme to support the CL and quantify its 
benefits by comparing its performance with that of previous schemes. 



2 Proposed Scheme 

We define the LSTP area as the one composed of the RAs which the corresponding 
MSCs serve. We then statically group the VLRs in the LSTP area in order to 
keep the consistency rate high. Even though it is impossible to maintain 
consistency at all times as a computer system does, we can effectively keep it 
high to some extent by the grouping method. The cache is used to reduce the 
call-tracking load and to make fast call setup possible. It is also possible 
to group RAs dynamically regardless of the LSTP area. Let us define the post 
VLR, P_VLR, which keeps the callee's current location information as long as 
the callee moves within its LSTP area. Suppose that the P_VLR and the VLR which 
serves the callee's RA belong to physically different LSTPs. In this case, we 
must tolerate the signaling overhead even though the caller and callee belong 
to the same dynamic group: many signaling messages for registering a location 
and tracking a call are transmitted via the RSTP instead of the LSTP. If the 
cost of transmitting the signaling messages via the RSTP is large compared to 
the LSTP, the dynamic grouping method may degrade the performance although it 
solves the Ping-Pong effect. Furthermore, this is critical when the RSTP is 
bottlenecked. 



(a) Before P_VLR is changed (Old Serving System, New Serving System; REGNOT) 

(b) After P_VLR is changed (Old Serving System, New Serving System) 

Fig. 2. Location registration in LSTP level caching scheme 



If a call originates and an entry for the callee's location exists in the LSTP 
level cache, the call is delivered to the P_VLR. Note that we do not have to 
consider where the callee currently is, because the P_VLR keeps the callee's 
current location as long as the callee moves within its LSTP area. Unless the 
terminal moves into a new LSTP area, there is no HLR registration. Fig. 2 
shows the message flow for location registration, depending on whether the 
P_VLR changes. 

In our scheme, the location information of a terminal is stored in the cache at 
the LSTP level when the number of call requests to that terminal is over a 
threshold (K). The number of call requests is counted in the P_VLR or in the 
VLR which serves the callee's RA. If we take a small K, there are many entries 
in the cache, but the possibility that the cached information and the callee's 
current location are inconsistent increases, because of the terminals whose 
call-to-mobility ratios (CMRs) are very low. On the contrary, if we take a 
large K, the number of terminals whose location information is referred to 
decreases, but the consistency is kept high. As for the hierarchical level at 
which the cache is located, the MSC level is not desirable considering the 
general call patterns. It is effective only for users whose working and 
resident areas are regionally very limited, i.e., one or two RAs. When a call 
is tracked using the cached information, the most important thing is to keep 
the ICR low enough that tracking with the cache is cost effective compared to 
tracking without it. In our scheme, the terminals with a high rate of call 
requests are stored in the cache regardless of their RA crossings within the 
LSTP area. It is questionable to select the cached terminals by CMR. Suppose 
that terminal A receives only one call and crosses an RA only once, and that 
terminal B receives many calls (n) and crosses RAs n times. Then the CMRs of 
the two terminals are the same, but it is desirable that the location 
information of terminal B be stored in the cache. As for the caching effect, 
the improvement in performance depends on the reference rate of the user 
location information more than on the number of entries in the cache, i.e., 
the cache size. The object of locating the cache at the LSTP level is to 
extend the domain of the region in which the callee's location information is 
referred to for many calls. 
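
The admission rule described above can be sketched as follows. This is a toy model under stated assumptions: the class name LstpCache, the terminal and P_VLR identifiers, and the choice K = 3 are all hypothetical, and eviction and invalidation are left out.

```python
from collections import Counter

class LstpCache:
    """Toy LSTP-level cache: a callee is cached once its call count exceeds K."""

    def __init__(self, threshold_k):
        self.threshold_k = threshold_k
        self.call_counts = Counter()   # call requests seen per callee
        self.entries = {}              # callee -> identifier of its P_VLR

    def record_call(self, callee, p_vlr):
        self.call_counts[callee] += 1
        if self.call_counts[callee] > self.threshold_k:
            self.entries[callee] = p_vlr

cache = LstpCache(threshold_k=3)
for _ in range(5):
    cache.record_call("terminal_B", "PVLR_1")   # frequently called terminal
cache.record_call("terminal_A", "PVLR_2")       # called only once
print(sorted(cache.entries))
```

Only the frequently called terminal is admitted, which mirrors the argument that the call request rate, rather than the CMR, should decide what is cached.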

The algorithm to track a call using the redial and LSTP level caches is shown 
below. 

LSTP_Cache_Tracking () 
{ 
    Call to mobile user is detected at local switch; 
    If callee is in its RA then 
        Return; 
    Redial_Cache ();                 /* optional */ 
    If there is an entry for callee in LSTP_Cache then 
    {                                /* cache hit */ 
        Query corresponding PVLR specified in cache entry; 
        If there exists the entry for callee in PVLR then 
            Callee VLR, V, maintained in PVLR returns callee's 
            location info. to calling switch; 
        else                         /* inconsistent state */ 
            Switch which serves PVLR continues Basic_Tracking (); 
    } 
    else                             /* cache miss */ 
        continue Basic_Tracking ();  /* Basic_Tracking () is used in IS-41 */ 
} 
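
The control flow of the algorithm can be made executable with a minimal sketch. It assumes that each database is a plain dictionary, that "its RA" means the RA of the calling switch, and that the optional redial cache is omitted; the names and sample data are hypothetical, and basic_tracking merely stands in for the IS-41 procedure.

```python
def basic_tracking(callee, hlr):
    """Stand-in for the IS-41 procedure: ask the HLR for the callee's location."""
    return hlr[callee]

def lstp_cache_tracking(callee, caller_ra, local_terminals, lstp_cache, pvlrs, hlr):
    """Return (location, outcome) for a call to `callee`."""
    if callee in local_terminals.get(caller_ra, set()):
        return caller_ra, "local"                 # callee is in the caller's RA
    pvlr_id = lstp_cache.get(callee)
    if pvlr_id is None:                           # cache miss
        return basic_tracking(callee, hlr), "miss"
    location = pvlrs[pvlr_id].get(callee)
    if location is None:                          # stale entry: inconsistent state
        return basic_tracking(callee, hlr), "inconsistent"
    return location, "hit"                        # cache hit, no HLR query

hlr = {"alice": "RA7", "bob": "RA9", "carol": "RA2"}
pvlrs = {"PVLR_1": {"alice": "RA7"}}
lstp_cache = {"alice": "PVLR_1", "bob": "PVLR_1"}   # bob's entry is stale
local_terminals = {"RA1": {"dave"}}

for callee in ("dave", "alice", "bob", "carol"):
    print(callee, lstp_cache_tracking(callee, "RA1", local_terminals,
                                      lstp_cache, pvlrs, hlr))
```

Only the "miss" and "inconsistent" outcomes reach the HLR, so the benefit of the cache depends on keeping the share of inconsistent entries low.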



Using the above algorithm, we support the locality shown in users' general 
call behavior. The tracking steps are depicted in Fig. 3. 



• L_Cache : Cache in LSTP level 

• RD_Cache : Redial Cache 

Note : (7), (8), (9), and (10) occur in inconsistent state 
(*) implies cache miss 

Fig. 3. Call tracking in LSTP level caching scheme 

In Fig. 3, we describe the call tracking steps in detail. 



(1) Call originates → VLR
    If callee is in its RA then (2)
(2) Return
(3) If callee is found in RD_Cache entry, then (5)
    Else (4)
(4) If callee is found in L_Cache, then (5)
    Else (9)
(5) Query PVLR
(6) If callee exists in PVLR entry, then (7)
    Else (8)
(7) VLR_callee returns the routing info. to MSC_callee
(8) Execute the IS-41 call tracking mechanism (step 9 is initiated)
(9) Query HLR
(10) Query PVLR to which HLR points for callee's current location info.
(11) VLR_callee to which PVLR points returns the routing info. to MSC_callee
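
The tracking steps above can be sketched in Python. This is only an illustration, not the authors' implementation: the dictionaries standing in for the redial cache, the LSTP level cache and the PVLR tables, and the `basic_tracking` callback standing in for the IS-41 procedure, are assumptions made for the example.

```python
from typing import Callable, Dict, Optional, Tuple


def lstp_cache_tracking(
    callee: str,
    local_ra: Dict[str, str],                 # callee -> VLR, terminals in the caller's own RA
    lstp_cache: Dict[str, str],               # L_Cache: callee -> PVLR id
    pvlr_tables: Dict[str, Dict[str, str]],   # PVLR id -> {callee -> VLR}
    basic_tracking: Callable[[str], str],     # hypothetical stand-in for IS-41 tracking via HLR
    redial_cache: Optional[Dict[str, str]] = None,  # optional RD_Cache: callee -> PVLR id
) -> Tuple[str, str]:
    """Return (VLR currently serving the callee, outcome label)."""
    if callee in local_ra:                            # steps (1)-(2)
        return local_ra[callee], "local"
    pvlr_id = redial_cache.get(callee) if redial_cache else None   # step (3)
    if pvlr_id is None:
        pvlr_id = lstp_cache.get(callee)              # step (4)
    if pvlr_id is None:                               # cache miss: steps (9)-(11)
        return basic_tracking(callee), "miss"
    vlr = pvlr_tables.get(pvlr_id, {}).get(callee)    # steps (5)-(6)
    if vlr is not None:                               # consistent cache hit: step (7)
        return vlr, "hit"
    return basic_tracking(callee), "inconsistent"     # step (8)
```

The outcome label makes it easy to count hits, misses and inconsistent states when estimating the hit ratio and the inconsistency ratio used in the cost analysis.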



3 Performance Analysis 

For numerical analysis, we adopt a hexagon model as the geometrical RA model.
Generally, it is assumed that a VLR serves one RA.




Fig. 4. RA model with hexagon structure



As shown in Fig. 4, the RAs in an LSTP area can be grouped. For example, there
are 7 RAs in the circle 1 area and 19 RAs in the circle 2 area. Terminals in RAs
strictly inside the circle n area remain in the circle n area after their first
RA crossing; that is, they cannot leave their LSTP area with a single RA
crossing. In contrast, terminals in RAs that touch the boundary of the circle
can move out of their LSTP area. The number of terminals moving out can be
calculated simply. Intuitively, the terminals in the arrow-marked areas of
Fig. 4 move out. Using the number of outside edges of the arrow-marked polygons,
we can compute the number of terminals which move out from the LSTP area as
follows.

$$\frac{\text{Total no. of outside edges in arrow-marked polygons}}{\text{No. of edges of hexagon} \times \text{No. of RAs in LSTP area}} \times \text{No. of terminals in LSTP area} \qquad (1)$$

In Fig. 4, the number of terminals which move out from the LSTP area is the
number of terminals in the LSTP area x 5/19. The number of VLRs in an LSTP area
represented as circle n is generalized as follows.

$$\text{No. of VLRs in LSTP area} = 1 + 3n(n+1) \quad (n = 1, 2, \ldots) \qquad (2)$$

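The rate of terminals which move out from the LSTP area can be generalized as
follows.

$$R_{\text{move\_out, No. of VLRs in LSTP area}} = \frac{2n+1}{\text{No. of VLRs in LSTP area}} \quad (n = 1, 2, \ldots) \qquad (3)$$

The numerator is the outside-edge count of the boundary RAs divided by the 6
edges of a hexagon; the rate is 3/7 for n = 1 and 5/19 for n = 2.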
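
The counts behind the move-out rate can be checked with a short Python sketch. It assumes, as our reading of Fig. 4 rather than a statement from the paper, that the outermost ring of circle n has 6 corner RAs with three outside edges each and 6(n - 1) side RAs with two outside edges each; the function names are hypothetical.

```python
from fractions import Fraction


def vlrs_in_lstp_area(n: int) -> int:
    """Number of RAs (one VLR each) in the circle n area."""
    return 1 + 3 * n * (n + 1)


def outside_edges(n: int) -> int:
    """Outside edges of the boundary RAs (assumed corner/side layout)."""
    corner_cells, side_cells = 6, 6 * (n - 1)
    return 3 * corner_cells + 2 * side_cells


def move_out_rate(n: int) -> Fraction:
    """Share of terminals leaving the LSTP area on one RA crossing."""
    return Fraction(outside_edges(n), 6 * vlrs_in_lstp_area(n))


for n in (1, 2, 3):
    print(n, vlrs_in_lstp_area(n), move_out_rate(n))
```

Under this assumption the sketch reproduces the value 5/19 quoted for Fig. 4 and the 3/7 used for an LSTP area of 7 RAs.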

Two RAs are said to be locally related when they belong to the same LSTP area
and remotely related when one of them belongs to a different LSTP area. The
terminal's RA crossings should be classified according to the local and remote
relations in the following schemes.

1. LA scheme 

— Relation of LA serving RA and callee’s last visiting RA 
— Relation of callee’s last visiting RA and callee’s current RA 

2. FP scheme 

— Relation of RA (where T = 0) and next visiting RA (where T = 1) 

— Relations of intermediate RAs in moving route of the terminal 
— Relation of previous RA (where T = n - 1) and callee’s current RA 
(Supposed that callee is currently located in the RA where T = n) 

3. Proposed scheme 

— Relation of PVLR serving RA and callee’s last visiting RA 
— Relation of callee’s last visiting RA and callee’s current RA

We define the probabilities that a terminal moves within the LSTP area and 
crosses the LSTP area as P (local) and P (remote), respectively. The above 
relations are determined according to the terminal’s moving patterns. 

$$P(\text{local}) = R_{\text{move\_in, No. of VLRs in LSTP area}}, \qquad P(\text{remote}) = R_{\text{move\_out, No. of VLRs in LSTP area}} \qquad (4)$$

P(local) and P(remote) vary according to the number of VLRs in the LSTP area.
Consider a terminal's n RA changes, where the number of RAs in the LSTP area is
7. If the terminal moves into a new LSTP area in its nth movement, it is located
in one of the outside RAs (6 RAs) of the new LSTP area. Therefore, when the
terminal moves for the (n + 1)th time, the two probabilities,
$R_{\text{move\_in, No. of VLRs in LSTP area}}$ and
$R_{\text{move\_out, No. of VLRs in LSTP area}}$, are both 3/6.
If the terminal's (n + 1)th movement occurs within the LSTP area, the two
probabilities for the (n + 2)th movement are 4/7 and 3/7, respectively.
Otherwise, the terminal has moved into a new LSTP area again, and the two
probabilities are both 3/6.
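
The movement pattern just described can be read as a two-state process: after a remote move the terminal sits in an outside RA, and after a local move it is somewhere in the 7-RA area. The Python sketch below works under that reading, which is our interpretation and not stated in this form in the paper; the names are hypothetical.

```python
from fractions import Fraction

P_REMOTE_AFTER_ENTRY = Fraction(3, 6)   # previous move entered a new LSTP area
P_REMOTE_AFTER_LOCAL = Fraction(3, 7)   # previous move stayed inside the LSTP area


def remote_probabilities(moves: int) -> list:
    """P(remote) for each of `moves` RA crossings, starting right after
    the terminal has entered a new LSTP area."""
    p_entry = Fraction(1)      # probability that the previous move was remote
    result = []
    for _ in range(moves):
        p_remote = (p_entry * P_REMOTE_AFTER_ENTRY
                    + (1 - p_entry) * P_REMOTE_AFTER_LOCAL)
        result.append(p_remote)
        p_entry = p_remote     # a remote move puts the terminal in a new LSTP area
    return result


print(remote_probabilities(4))
```

Exact fractions are used so that the values can be compared directly with the 3/6, 4/7 and 3/7 figures in the text.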

3.1 Location registration cost 

If a terminal moves within the LSTP area (the case of the local relation), the
PVLR is updated and the entry in the old VLR is cancelled. If a terminal moves
into an RA in a new LSTP area, the HLR is updated, and then the entries in the
old PVLR and old VLR are cancelled. We define the signaling costs (SCs) as
follows.

SC1: Cost of transmitting a message from one VLR to another VLR
through HLR

SC2: Cost of transmitting a message from one VLR to another VLR
through RSTP

SC3: Cost of transmitting a message from one VLR to another VLR
through LSTP

We evaluate the performance according to the relative values of SC1, SC2, and
SC3, which are needed for location registration. Generally, we can assume
SC3 ≤ SC2 < SC1 or SC3 ≤ SC2 << SC1. Even though the absolute values cannot be
determined, the difference of relative values can be computed.

Case 1: SC3 < SC2 < SC1
Case 2: SC3, SC2 << SC1
Case 3: SC3 << SC2, SC1
Case 4: SC3, SC2 < SC1

The registration cost for LSTP level caching scheme is computed as follows. 

$$C_{\text{LSTP level caching, Loc.Reg}} = R_{\text{move\_in, No. of VLRs in LSTP area}} \times (2 \times 2\,SC3) + R_{\text{move\_out, No. of VLRs in LSTP area}} \times 2\,(SC1 + SC3) \qquad (5)$$
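
As a worked illustration of equation (5), the Python sketch below evaluates the registration cost for LSTP areas of 7 and 19 VLRs. It reads (5) as a weighted sum of a local cost 2 x 2SC3 and a remote cost 2(SC1 + SC3), and it assumes that the move-in rate is one minus the move-out rate; the cost values SC1 = 10 and SC3 = 1 are arbitrary examples, not values from the paper.

```python
from fractions import Fraction


def registration_cost(r_move_out, sc1, sc3):
    """Location registration cost per RA crossing, following our reading of (5)."""
    r_move_in = 1 - r_move_out          # assumption: a crossing is either local or remote
    local_cost = 2 * (2 * sc3)          # update PVLR, cancel old VLR entry
    remote_cost = 2 * (sc1 + sc3)       # update HLR, cancel old PVLR/VLR entries
    return r_move_in * local_cost + r_move_out * remote_cost


cost_7 = registration_cost(Fraction(3, 7), sc1=10, sc3=1)
cost_19 = registration_cost(Fraction(5, 19), sc1=10, sc3=1)
print(cost_7, cost_19)
```

With these example costs the larger LSTP area gives the lower registration cost, in line with the trend described for Fig. 5.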

The registration cost set (RCS) consists of SC1, SC2, and SC3. Note that the
larger the RCS number in the following figures, the smaller the relative
difference of SC2 and SC3 from SC1. This applies to the above 4 cases. In
Fig. 5, the registration cost decreases as the number of VLRs in the LSTP area
increases. When the terminals' average speed is fixed and
$R_{\text{move\_out, No. of VLRs in LSTP area}}$ is smaller, the HLR access cost
is reduced. Therefore, the more VLRs there are in the LSTP area, the better the
performance.

The difference between Fig. 5 (a) and (b) results from the reduction in HLR
access traffic according to the number of VLRs in the LSTP area.

Fig. 6 (a) shows that the LSTP level caching scheme gives better results than
the LA scheme when 1/M_no >= 0.5, where M_no is the number of terminal RA
crossings before a call request. The more VLRs there are in the LSTP area, the
smaller the 1/M_no at which the two schemes perform similarly. This implies that
the performance of the LSTP level caching scheme with a large number of VLRs is
proportional to the effectiveness, in terms of registration cost, of the LA
scheme applied in environments with high terminal mobility.




[Plots: Registration Cost (7 VLRs in LSTP area). (a) LSTP level caching scheme
vs. LA scheme; (b) LSTP level caching scheme vs. FP scheme]

Fig. 6. Registration cost comparison



Compared to the FP scheme with T = 2 and T = 3, the LSTP level caching scheme
shows better results in both cases. This is mainly caused by the degeneration of
all VLRs in the terminal's moving route when T is reset in the FP scheme. In the
case of 2 movements in Fig. 6 (b), the performance for T = 1 is better than that
for T = 3 when RCS 3~10 are applied. We anticipate that the registration cost
of the FP scheme decreases during a terminal's first (T - 1) RA crossings but
increases suddenly at the Tth RA crossing compared to other schemes. This effect
becomes greater with larger T.

3.2 Call tracking cost 

We define the rate of the cached terminals, i.e., those whose call_count in the
PVLR is over the threshold (K), as $R_{\text{No.ofCachedTRs}}$. If there exists
an entry in the LSTP level cache for the subsequently requested call, the
corresponding PVLR is queried. If not, the HLR is queried and the VLR pointed to
by the HLR entry is queried subsequently. A call is said to originate locally if
the caller and callee are located in the same LSTP area. Otherwise, it is said
to originate remotely. In call tracking, we consider those two call types and
classify each call type into two cases, consistent and inconsistent. We define
the ICR as the ratio at which the cached information and the entry in the PVLR
are inconsistent. Therefore, the ICR is the ratio of the number of terminals
which move out from their LSTP area to the number of terminals in the LSTP area
to which calls are requested. Put simply, it is the rate of terminals which move
out from their LSTP area if calls are requested to all terminals in the LSTP
area. The ICR is written as follows.

$$ICR = \frac{R_{\text{move\_out, LSTP area}} \times R_{\text{reg, VLR}}}{R_{\text{CallOrig./hr/TR}} \times (\text{No. of Terminals}/\text{VLR})} \qquad (6)$$

In equation (6), $R_{\text{reg, VLR}}$ is directly proportional to the terminals'
average speed. If the terminals' average speed is fixed, then the more calls
originate, the lower the ICR is. The call tracking cost is influenced by the
number of VLRs in the LSTP area, the terminals' moving speed, the call
origination rate, and the cache reference rate. In other words, the caching
effect depends on the HR of the LSTP level cache and the ICR in the callee's
LSTP area. As mentioned above, the call relation between caller and callee
should be considered: according to this relation, the message-transmitting route
via LSTP or RSTP is determined. The following are the generalized tracking costs
according to the call relations.



— Tracking cost when caller and callee are locally related:

$$C_{\text{LSTP level caching, CallTracking}} = (1 - R_{\text{No.ofCachedTRs}}) \times C_{\text{IS-41, CallTracking}} + R_{\text{No.ofCachedTRs}} \times [HR \times \{(1 - ICR) \times 2\,SC3 + ICR \times 2\,(SC1 + SC3)\} + (1 - HR) \times 2\,SC1] + C_{\text{cache}} \qquad (7)$$

— Tracking cost when caller and callee are remotely related:

$$C_{\text{LSTP level caching, CallTracking}} = (1 - R_{\text{No.ofCachedTRs}}) \times C_{\text{IS-41, CallTracking}} + R_{\text{No.ofCachedTRs}} \times [HR \times \{(1 - ICR) \times 2\,SC2 + ICR \times 2\,(SC1 + SC2)\} + (1 - HR) \times 2\,SC1] + C_{\text{cache}} \qquad (8)$$
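
Equations (6) to (8) can be evaluated with a few lines of Python. The sketch below follows our reading of those equations; the parameter names are hypothetical, `sc_route` stands for SC3 in the local case (7) and SC2 in the remote case (8), and the IS-41 tracking cost is passed in as a parameter because its expression is not given in this section.

```python
from fractions import Fraction


def icr(r_move_out, r_reg, call_rate, terminals_per_vlr):
    """Inconsistency ratio, following our reading of equation (6)."""
    return (r_move_out * r_reg) / (call_rate * terminals_per_vlr)


def tracking_cost(r_cached, hr, icr_value, sc1, sc_route, c_is41, c_cache=0):
    """Call tracking cost per equations (7)/(8).

    sc_route is SC3 for locally related calls and SC2 for remotely related calls.
    """
    hit = (1 - icr_value) * 2 * sc_route + icr_value * 2 * (sc1 + sc_route)
    cached = hr * hit + (1 - hr) * 2 * sc1
    return (1 - r_cached) * c_is41 + r_cached * cached + c_cache


low_call_rate = icr(Fraction(3, 7), r_reg=2, call_rate=3, terminals_per_vlr=4)
high_call_rate = icr(Fraction(3, 7), r_reg=2, call_rate=6, terminals_per_vlr=4)
print(low_call_rate, high_call_rate)
print(tracking_cost(r_cached=1, hr=1, icr_value=low_call_rate,
                    sc1=10, sc_route=1, c_is41=20))
```

The two printed ICR values illustrate the remark that, for a fixed registration rate, a higher call origination rate lowers the ICR.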



[Plots: Call Tracking Cost versus Tracking Cost Set (TCS) 1 to 10, for remote
calls and local calls with HR = 1.0, 0.9, 0.8, 0.7. (a) SC3, SC2 < SC1;
(b) SC3, SC2 << SC1]

Fig. 7. TCS classification (9 VLRs in LSTP area)



The tracking cost decreases as the number of VLRs in the LSTP area increases.
This implies that the more VLRs there are in the LSTP area, the lower
$R_{\text{move\_out, No. of VLRs in LSTP area}}$ is and, subsequently, the lower
the ICR is. The performance depends on the ICR as well as the HR, and on the
influence of the terminals' moving speed on the ICR. As with the RCS in Section
3.1, the larger the tracking cost set (TCS) number, the smaller the relative
difference of SC2 and SC3 from SC1. Unlike the registration cost, the tracking
cost is influenced by SC2. The influence of SC2 varies according to the ratio of
remote calls to total calls. If the difference between SC2 and SC3 is very
small, we do not have to distinguish local and remote calls. The reduced traffic
in Fig. 7 (b) results from SC2 and SC3 being relatively smaller than those in
(a). The performances for local calls and remote calls are shown separately.
Note that the call relation should be considered separately. The degree of call
locality is related to the size of the area from which calls to a terminal are
generated. Considering the geographical region to which the caller and callee
belong, the caller's moving speed heavily affects the call relation but has
little effect on call locality when the cache is at the LSTP level. With LSTP
level caching, we can overcome the limit on the degree of call locality. As a
result, the reference rate of the callee's location information is increased.



3.3 Traffic comparison 

In the performance evaluation, the rate of local calls is assumed to be 0.7.
When we simulated with various rates, there were only slight changes in the
results. In fact, the performance depends mainly on SC1 rather than SC2 or SC3.
If the difference between SC2 and SC3 is not great, we can regard them as the
same cost term. If the HR is lower for local and remote calls, the relative
portion of SC1 to SC2 and SC3 in tracking a call increases. In the case of a
cache miss, we fall back on the call tracking mechanism of the IS-41 scheme.



[Bar chart: Comparison of Call Tracking Cost (7 VLRs in LSTP area) versus
Tracking Cost Set (TCS), for IS-41, LA, FP (T = 3), FP (T = 4), and LSTP level
caching with HR = 1.0, 0.9 and 0.8]



Fig. 8. Tracking cost comparison (where SC3, SC2 << SC1)

In Fig. 8, the costs decrease in the order FP > LA > IS-41 > LSTP level
caching. The LA scheme requires one more traversal to track a call than the
IS-41 scheme. The difference between the two costs of the FP scheme is due to
one more traversal via LSTP or RSTP. Additionally, the cost of one more
traversal varies according to the local or remote relation between the callee's
RA and the previous RA in the tracking route. T should be chosen considering
both the cost effectiveness in call tracking and the degeneration overhead. If T
is too large, we must also tolerate the setup delay. Although the registration
costs of the LA and FP schemes vary according to M_no and T, respectively, those
two schemes are cost effective compared to the IS-41 scheme. However, as for the
tracking cost, those two schemes degrade the performance due to the inherent
additional traversals and degeneration overhead. The larger the relative
difference of SC2 and SC3 from SC1, the more the performance of the LSTP level
caching scheme improves compared to the other schemes. In an environment with a
high call origination rate, the LA and FP schemes degrade the performance and
delay the setup time. Therefore, those schemes should be adopted considering the
trade-off between the registration cost and the call tracking cost.

4 Conclusions 

Previous schemes aim to decrease either the registration cost due to terminals'
RA crossings or the call tracking cost due to call origination [2, 5, 6, 7, 8].
In those schemes, however, the trade-off between the registration cost and the
call tracking cost is an issue. Although a few schemes try to reduce the
signaling traffic due to tracking calls, many schemes have mainly focused on
reducing the signaling traffic due to location registration. We propose a
solution that improves on the performance of previous schemes, including the
IS-41 and GSM standards. Our scheme mainly focuses on supporting the CL
efficiently. The purpose of placing the cache at the LSTP level is to reuse the
location information of callees that receive many call requests and,
furthermore, to increase the reference rate. In addition, the RAs in an LSTP
area are grouped statically to decrease the ICR, to mitigate the traffic
bottleneck at the HLR and RSTP, and to decrease the signaling traffic caused by
terminals' frequent RA crossings. The cost evaluation shows that the more VLRs
there are in the LSTP area and the higher the call origination rate, the better
the performance. Terminals' frequent RA crossings affect our scheme much less
than they affect the previous schemes, including the IS-41 and GSM standards.



Abstract. Although theoretical methods were proposed for tolerating
multiple disk failures, their complexities are too high to be practically
applicable. In this paper, we propose a practical parity scheme for
tolerating simultaneous triple disk failures in RAID architectures. We first
formalize the problem with matrix operations. Our scheme is practical
in the sense that it employs three redundant disks for tolerating triple
disk failures. Furthermore, it requires only simple arithmetic computations,
and it can easily be implemented on current RAID controllers.



1 Introduction 

Computers are changing rapidly in terms of their application areas and
performance level. In the worst case, even two- or three-year-old computers
become obsolete from the customer's point of view. Meanwhile, I/O systems and
storage systems are lagging behind the performance increase of processors. As
the gap between the performance of processors and I/O systems grows, the
performance of a computer system will eventually be limited by the performance
of I/O [1], [2]. Therefore, it is important to balance the I/O performance and
the capability of storage systems with the processing power in order to improve
the overall performance of a computer system.

Improving I/O performance [2], [3], [4], known as data clustering and disk
striping in disk array systems, has been one of the main research topics for
computer architects in recent years. Disk array systems utilize the parallelism
among many disks in a system. Furthermore, we need more disks for storing new
forms of data like audio, video and fax. However, incorporating such a large
number of disks into a system makes the disk system more vulnerable to failure
than the single disk case [3]. Hence, designing a disk array system which can
keep data correct while tolerating disk failures is essential.

Patterson et al. [3] have proposed the Redundant Array of Inexpensive Disks
(RAID). They defined five different levels of RAID structures (RAID levels 1~5)
depending on the data and parity placement scheme. Basically, RAID tolerates
only one disk failure; data may be lost when two or more disks fail.



J. He and M. Sato (Eds.): ASIAN 2000, LNCS 1961, pp. 58—68, 2000. 
(c) Springer-Verlag Berlin Heidelberg 2000

Although error-correcting schemes such as Hamming and Reed-Solomon codes [5]
were used to tolerate multiple disk failures, their implementation was
extremely complex. Thus, it would be desirable to have codes that use relatively
simple operations, such as exclusive-ORs or simple arithmetic calculations. Some
convolutional type codes [6] were also used for tolerating disk failures. When
the error correcting capacity of such a code is exceeded, however, infinite
error propagation occurs. Therefore, we need a block type code with simple
operations.

Theoretically, such a coding scheme was proposed in [7], [8], [9] and was later
generalized in [10] for multiple erasures, where an erasure is an error whose
location is known. However, one problem remains: the encoding and decoding
schemes are very complex [11]. Although a multiple erasure correcting scheme was
implemented practically in [12], [13], it could tolerate only double failures.

These days, there is a growing demand for highly reliable storage systems for
user data, so RAID has to be extended to recover data when two or more disks
fail. Hence, to satisfy this requirement, we need a practical scheme that
tolerates more disk failures.

In this paper, we propose a practical scheme for tolerating triple disk failures
in RAID architectures. We first formalize the problem with matrix operations.
Our scheme is practical in the sense that it employs three redundant disks for
tolerating single, double and triple disk failures. We present a simple encoding
procedure that is based on exclusive-OR operations and independent parities.
Therefore, our encoding scheme has no recursion, unlike convolutional type
codes. We also present a practical decoding procedure for tolerating triple
erasures as well as single and double errors.

The paper is organized as follows. In the next section, we describe the encoding
procedure used in our scheme. In Section 3, we present the corresponding
decoding procedure. Then, we consider implementing our scheme on a standard RAID
controller in Section 4. Finally, we present some concluding remarks in
Section 5.



2 Encoding 



For the purpose of explanation, the following assumptions are made. First, we
assume that there are m + 3 disks, with the information stored in the first m
disks and the redundant data stored in the last three disks. We also assume that
m, the number of information disks, is a prime number. However, if we want to
use an arbitrary (not necessarily prime) number of information disks, we can
take the next prime number following this arbitrary number and assume that the
extra disks hold no information, i.e., all their information bits are 0. These
disks are not actually used; they are only for computation.

Our procedure works for disks with arbitrary capacity by treating each block
of m - 1 symbols separately. For simplicity, in some of our examples, we will
assume that each symbol is a bit. In some applications, a symbol may be as
big as a 512 byte disk sector. It is not required to assume that the symbols are
binary.

Based on these assumptions, the problem of tolerating triple disk failures can
be described as follows. First, consider an $(m-1) \times (m+3)$ array, where m
is a prime number. Symbol $a_{i,j}$, $0 \le i \le m-2$, $0 \le j \le m+2$, is
the ith symbol in the jth disk. For some applications, a column of the array may
be regarded as a disk and a symbol as a disk sector. The last three disks, m,
m + 1, and m + 2, are the disks with the redundant information.

We use the notation $(n)_m = j$ if and only if $j \equiv n \pmod{m}$ and
$0 \le j \le m-1$. For instance, $(7)_5 = 2$ and $(-2)_5 = 3$.

In addition, $A = \bigoplus_{t=l}^{m} a_{n,t}$ means
$A = a_{n,l} \oplus a_{n,l+1} \oplus a_{n,l+2} \oplus \cdots \oplus a_{n,m}$, and $\oplus$
represents modular addition.
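
The two pieces of notation map directly onto Python operators. The helper names below are hypothetical; the sketch assumes integer symbols, for which bitwise exclusive-OR is addition modulo 2 applied to each bit.

```python
from functools import reduce
from operator import xor


def residue(n: int, m: int) -> int:
    """(n)_m: the representative of n modulo m in the range 0..m-1."""
    return n % m   # Python's % is already non-negative for m > 0


def xor_sum(symbols):
    """Fold a sequence of integer symbols with exclusive-OR."""
    return reduce(xor, symbols, 0)


print(residue(7, 5), residue(-2, 5))   # the two instances given in the text
print(xor_sum([1, 0, 1, 1]))
```

In languages where the remainder operator can return a negative value for a negative operand, the residue would need an explicit adjustment into the range 0..m-1.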

We also assume that there is an imaginary all-zero row after the last row, i.e.,
$a_{m-1,j} = 0$ for $0 \le j \le m-1$, so that the array is now regarded as an
m x (m + 2) array.

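Then, for each l, $0 \le l \le m-2$, the redundant symbols are computed as
follows:

$$a_{l,m} = \bigoplus_{t=0}^{m-1} a_{l,t} \qquad (1)$$

$$a_{l,m+1} = \bigoplus_{t=0}^{m-1} a_{(l-t)_m,\,t} \qquad (2)$$

$$a_{l,m+2} = \bigoplus_{t=0}^{m-1} a_{(2(t-l))_m,\,t} \qquad (3)$$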

Equations (1), (2) and (3) define the encoding procedure. We have three
types of redundancy: one horizontal redundancy and two vertical redundancies.
Disk m is simply the exclusive-OR of disks 0, 1, ..., m - 1; its contents are
exactly the same as those of the parity. Disk (m + 1) carries the vertical
redundancy according to (2), and disk (m + 2) carries another vertical
redundancy according to (3).

Let us look closely at equations (1), (2) and (3). The horizontal redundancy is
a simple parity of the row symbols. The first vertical redundancy is the parity
of column symbols obtained as the modular addition of one cyclic shift of the
1st row, two cyclic shifts of the 2nd row, three cyclic shifts of the 3rd row,
..., and (m - 1) cyclic shifts of the (m - 1)th row. The second vertical
redundancy is the parity of column symbols obtained as the modular addition of
two cyclic shifts of the 1st row, 4 cyclic shifts of the 2nd row, 6 cyclic
shifts of the 3rd row, ..., and 2 x (m - 1) cyclic shifts of the (m - 1)th row.
The (m - 1) x (m + 3) array defined above can recover the information lost in
any three columns.
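
A minimal Python sketch of the encoding procedure follows. It reads equation (2) as XORing the cells $a_{(l-t)_m,t}$ and equation (3) as XORing the cells $a_{(2(t-l))_m,t}$, with the imaginary row m - 1 contributing zero. The function name and the sample 4 x 5 bit array are assumptions chosen for illustration, not data taken from the paper's figures.

```python
def encode(data, m):
    """Append the three redundant symbols to each row of an (m-1) x m array.

    data[r][t] is the symbol in row r of information disk t; symbols are
    integers combined with exclusive-OR.
    """
    def sym(r, t):
        # The imaginary row m-1 holds only zeros.
        return 0 if r == m - 1 else data[r][t]

    encoded = []
    for l in range(m - 1):
        horizontal = vertical_1 = vertical_2 = 0
        for t in range(m):
            horizontal ^= sym(l, t)                   # equation (1)
            vertical_1 ^= sym((l - t) % m, t)         # equation (2)
            vertical_2 ^= sym((2 * (t - l)) % m, t)   # equation (3)
        encoded.append(list(data[l]) + [horizontal, vertical_1, vertical_2])
    return encoded


SAMPLE = [            # illustrative data, m = 5
    [1, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 1, 0, 1, 1],
]
for row in encode(SAMPLE, 5):
    print(row)
```

Each redundant symbol needs m exclusive-OR operations, so one stripe of m - 1 rows costs on the order of m squared operations.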

We illustrate the encoding procedure with figures for m = 5. Fig. 1 shows how to
compute the horizontal redundancy, Fig. 2 the first vertical redundancy, and
Fig. 3 the second vertical redundancy.

Fig. 1. Calculation of horizontal parity

Fig. 2. Calculation of the 1st vertical parity

Fig. 3. Calculation of the 2nd vertical parity



As we can see, our encoding procedure is very simple and the circuits imple- 
menting (1), (2) and (3) are straightforward. More generally, we can implement 
(1), (2) and (3) in software with exclusive-OR hardware of the current RAID 
controllers. 

The following example illustrates the encoding procedure for m = 5.

Example 1: Let m = 5, and let the symbols be denoted by $a_{i,j}$,
$0 \le i \le 3$, $0 \le j \le 7$. The redundant symbols are in columns 5, 6 and
7. A practical implementation of this example uses 8 disks numbered 0 to 7, each
with 4 disk sectors. The data sectors are on disks 0, 1, 2, 3 and 4, and the
redundant sectors are on disks 5, 6 and 7. Then, according to (1), (2) and (3),
the redundant symbols are computed as follows:

$$\begin{aligned}
a_{l,5} &= a_{l,0} \oplus a_{l,1} \oplus a_{l,2} \oplus a_{l,3} \oplus a_{l,4}, \quad 0 \le l \le 3 \\
a_{0,6} &= a_{0,0} \oplus a_{3,2} \oplus a_{2,3} \oplus a_{1,4} \\
a_{1,6} &= a_{1,0} \oplus a_{0,1} \oplus a_{3,3} \oplus a_{2,4} \\
a_{2,6} &= a_{2,0} \oplus a_{1,1} \oplus a_{0,2} \oplus a_{3,4} \\
a_{3,6} &= a_{3,0} \oplus a_{2,1} \oplus a_{1,2} \oplus a_{0,3} \\
a_{0,7} &= a_{0,0} \oplus a_{2,1} \oplus a_{1,3} \oplus a_{3,4} \\
a_{1,7} &= a_{3,0} \oplus a_{0,1} \oplus a_{2,2} \oplus a_{1,4} \\
a_{2,7} &= a_{1,0} \oplus a_{3,1} \oplus a_{0,2} \oplus a_{2,3} \\
a_{3,7} &= a_{1,1} \oplus a_{3,2} \oplus a_{0,3} \oplus a_{2,4}
\end{aligned}$$
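
The index pattern of Example 1 can be generated mechanically. The sketch below lists, for a parity row l, which cells are XORed into each redundant symbol; it assumes the index rules $(l-t)_m$ and $(2(t-l))_m$ for the two vertical parities and drops the imaginary all-zero row. The function name is hypothetical.

```python
def parity_cells(l, m):
    """Cells (row, column) XORed into a[l][m], a[l][m+1] and a[l][m+2]."""
    groups = (
        [(l, t) for t in range(m)],                      # horizontal parity
        [((l - t) % m, t) for t in range(m)],            # first vertical parity
        [((2 * (t - l)) % m, t) for t in range(m)],      # second vertical parity
    )
    # Row m-1 is the imaginary zero row, so its cells are omitted.
    return [[(r, t) for r, t in cells if r != m - 1] for cells in groups]


for l in range(4):
    horizontal, vertical_1, vertical_2 = parity_cells(l, 5)
    print(l, vertical_1, vertical_2)
```

For m = 5 the printed cell lists can be compared line by line with the expressions for columns 6 and 7 in Example 1.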

For instance, assume that we want to encode the 5 columns shown in Fig. 4.

Fig. 4. An example of 4 x 5 array for encoding.

The encoding procedure fills the last three columns with encoded symbols and
produces the array shown in Fig. 5.

Fig. 5. Encoding result of Fig. 4.



The proposed encoding scheme satisfies the following theorem and corollary on
computation time and memory space requirement.

Theorem 1: Given m information disks, redundant data for tolerating triple
disk failures can be computed in $O(m^2)$ time by using our encoding procedure.

Proof: Each of equations (1), (2), and (3) requires $O(m)$ time, and each
equation is executed m times. Therefore, the total time for encoding is
$O(m^2)$.

Corollary 1: Given m information disks, redundant data for tolerating triple
disk failures can be computed in $O(m^2)$ space by using our encoding procedure.

Proof: The additional memory space requirement is determined by the size of the
matrix, (m - 1) x (m + 3). Thus, the memory space requirement for the encoding
procedure is bounded by $O(m^2)$.

3 Decoding 

The essential part of our scheme is the decoding algorithm for triple erasures.
Our algorithm can be implemented either in software or in hardware, depending
on the application. It is executed when one disk fails, when two disks fail
simultaneously, or when three disks fail simultaneously. We assume the
(m - 1) x (m + 3) array for decoding shown in Fig. 6. In the following, we
present the algorithm, which in effect corrects triple erasures.

Fig. 6. An (m - 1) x (m + 3) array for decoding



We assume that columns (disks) i, j and k have failed, where i < j < k. Then,
equations (1), (2) and (3) give (m - 1) x 3 equations. We have (m - 1) x 3
unknown values and (m - 1) x 3 independent equations, given in (4), (5) and (6).
Thus, we can obtain the solution by solving the matrix equation [14].



For each l, $0 \le l \le m-2$, with $a_{m-1,t} = 0$:

$$a_{l,0} \oplus a_{l,1} \oplus a_{l,2} \oplus \cdots \oplus a_{l,i} \oplus \cdots \oplus a_{l,j} \oplus \cdots \oplus a_{l,k} \oplus \cdots \oplus a_{l,m-1} = a_{l,m} \qquad (4)$$

$$a_{(l)_m,0} \oplus a_{(l-1)_m,1} \oplus a_{(l-2)_m,2} \oplus \cdots \oplus a_{(l-i)_m,i} \oplus \cdots \oplus a_{(l-j)_m,j} \oplus \cdots \oplus a_{(l-k)_m,k} \oplus \cdots \oplus a_{(l-m+1)_m,m-1} = a_{l,m+1} \qquad (5)$$

$$a_{(-2l)_m,0} \oplus a_{(2(1-l))_m,1} \oplus a_{(2(2-l))_m,2} \oplus \cdots \oplus a_{(2(i-l))_m,i} \oplus \cdots \oplus a_{(2(j-l))_m,j} \oplus \cdots \oplus a_{(2(k-l))_m,k} \oplus \cdots \oplus a_{(2(m-1-l))_m,m-1} = a_{l,m+2} \qquad (6)$$

From equations (4), (5) and (6) we can form a matrix equation (7). Then,
reducing the coefficient matrix of (7) to the identity matrix [I] yields (8),
whose right-hand side gives the values of the unknown symbols.

[Equations (7) and (8): matrix form of (4), (5) and (6) and its reduced form]



The following example illustrates the decoding procedure for m=5. 



Example 2: We assume that m = 5, as in Example 1. Also, we assume the array
shown in Fig. 7, where columns 1, 3 and 4, marked with the symbol x, have been
erased.



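disk:   0  1  2  3  4  5  6  7
row 0:  1  x  1  x  x  1  1  1
row 1:  0  x  1  x  x  0  1  0
row 2:  1  x  0  x  x  0  0  0
row 3:  0  x  0  x  x  1  1  0

Fig. 7. An example of three erased disks.

We can make the following 12 equations from (1), (2) and (3):

$$\begin{aligned}
1 &= 1 \oplus a_{0,1} \oplus 1 \oplus a_{0,3} \oplus a_{0,4} & 1 &= 1 \oplus 0 \oplus a_{2,3} \oplus a_{1,4} & 1 &= 1 \oplus a_{2,1} \oplus a_{1,3} \oplus a_{3,4} \\
0 &= 0 \oplus a_{1,1} \oplus 1 \oplus a_{1,3} \oplus a_{1,4} & 1 &= 0 \oplus a_{0,1} \oplus a_{3,3} \oplus a_{2,4} & 0 &= 0 \oplus a_{0,1} \oplus 0 \oplus a_{1,4} \\
0 &= 1 \oplus a_{2,1} \oplus 0 \oplus a_{2,3} \oplus a_{2,4} & 0 &= 1 \oplus a_{1,1} \oplus 1 \oplus a_{3,4} & 0 &= 0 \oplus a_{3,1} \oplus 1 \oplus a_{2,3} \\
1 &= 0 \oplus a_{3,1} \oplus 0 \oplus a_{3,3} \oplus a_{3,4} & 1 &= 0 \oplus a_{2,1} \oplus 1 \oplus a_{0,3} & 0 &= a_{1,1} \oplus 0 \oplus a_{0,3} \oplus a_{2,4}
\end{aligned} \qquad (9)$$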



Then, we have 12 unknown symbols and 12 equations. Thus, we can form the matrix
equation (10) from (9). Equation (10) can be written as (11).

So we can solve for the unknown symbols from (11). Finally, the reconstructed
array is shown in Fig. 8.

Fig. 8. Reconstruction of three erased disks.



We can also reconstruct single and double disk failures by (4) through (8). In
those cases, there are fewer unknown values: (m - 1) or 2 x (m - 1).
Additionally, we can consider the following special cases for the positions of
failed disks. In these cases, we can reduce the computation by applying the
encoding procedure instead of matrix operations for decoding.

Triple disk failures are considered.

• m ≤ i, j, k ≤ m + 2: all of the redundant disks have failed. We can
reconstruct disk m using (1), disk m + 1 using (2), and disk m + 2 using (3). In
other words, the reconstruction is the same as encoding.

• i < m and m + 1 ≤ j, k ≤ m + 2: one information disk and two vertical
redundant disks have failed. We can reconstruct the information disk by equation
(1), and the redundant disks by (2) and (3).

Double disk failures are considered.

• m + 1 ≤ i, j ≤ m + 2: both vertical redundant disks have failed. We can
reconstruct disk m + 1 using (2) and disk m + 2 using (3). In other words, the
reconstruction is the same as encoding.

• i = m and m + 1 ≤ j ≤ m + 2: one horizontal redundant disk and one vertical
redundant disk have failed. We can reconstruct the horizontal redundant disk
by equation (1), and the vertical redundant disk by (2) or (3).

• i < m and m + 1 ≤ j ≤ m + 2: one information disk and one vertical redundant
disk have failed. We can reconstruct the information disk by equation (1), and
the vertical redundant disk by (2) or (3).

Finally, single disk failure is considered.

• m ≤ i ≤ m + 2: one redundant disk has failed. We can reconstruct disk m
using (1), disk m + 1 using (2), or disk m + 2 using (3). In other words, the
reconstruction is the same as encoding.

• i < m: one information disk has failed. We can reconstruct the information
disk by equation (1).
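
The simplest of these special cases, a single failed information disk, needs only equation (1): each missing symbol is the exclusive-OR of the other symbols in its row together with the horizontal redundant symbol. A minimal Python sketch follows; the function name and the sample array are assumptions for illustration.

```python
def rebuild_from_row_parity(array, failed, m):
    """Rebuild one failed column (index 0..m) of an (m-1) x (m+3) array via (1)."""
    repaired = [list(row) for row in array]
    for row in repaired:
        value = 0
        for c in range(m + 1):      # information disks 0..m-1 and horizontal disk m
            if c != failed:
                value ^= row[c]
        row[failed] = value
    return repaired


X = None
DAMAGED = [                         # illustrative array, m = 5, disk 2 lost
    [1, 0, X, 1, 0, 1, 1, 1],
    [0, 1, X, 0, 0, 0, 1, 0],
    [1, 1, X, 0, 0, 0, 0, 0],
    [0, 1, X, 1, 1, 1, 1, 0],
]
for row in rebuild_from_row_parity(DAMAGED, 2, 5):
    print(row)
```

The same function also rebuilds the horizontal redundant disk itself when `failed` equals m, which is the "reconstruction is the same as encoding" case.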

The proposed decoding scheme satisfies the following theorems for compu- 
tation amount and memory space requirement. 

Theorem 2: Given m information disks, triple disk failures can be recovered 

in 0{m^) time by using our decoding procedure. 

Proof: Equation (8) can be solved in 0{m^) time. Therefore, the total time for 
decoding is O(m^). 

Corollary 2: Given m information disks, triple disk failures can be recovered
in $O(m^2)$ space by using our decoding procedure.

Proof: The additional memory space requirement is determined by the size of the
(m - 1) x (m + 3) matrix. Thus, the memory space requirement for the decoding
procedure is bounded by $O(m^2)$.

4 Implementing on standard RAID controller

The configuration of a standard RAID system consists of the disk pool and the
RAID controller, as in Fig. 9. The RAID controller includes driver software
and the parity controller for parity management, and it can be implemented in
software or hardware.

To implement our scheme, we need to change the parity control algorithm as in
Fig. 10. Normally, the parity controller implements the XOR function in hardware
and the calculation function in software, so we can implement our scheme in the
software calculation function without any hardware change.



Fig. 9. Configuration of standard RAID system

Fig. 10. Configuration of proposed algorithm implemented in standard RAID
controller



5 Concluding Remarks 

In this paper, we presented a practical scheme for tolerating simultaneous triple 
disk failures in RAID architectures. We first formalized the problem by using 
matrix operations. Then we showed the encoding and the decoding schemes in 
detail. The scheme requires simple arithmetic computations and can be easily 
implemented on standard RAID controllers without any hardware change. The 
proposed scheme can be computed in O(m^) time, whereas the theoretical ap- 
proach requires O(m^) time. 

The symbols can be any size, from bit to multiple sectors. In some previous 
methods, convolutional type codes were used and an error in decoding propa- 
gated indefinitely. However, our scheme employs block type codes, which do not 
cause this problem. 

Although we used RAID architectures in this paper as an example of target
applications, our encoding/decoding scheme can also be used in multiple-track
recording applications, such as magnetic recording.
Abstract. In this paper, we investigate the extended cell assignment
problem, which optimally assigns new and split cells in PCS (Personal
Communication Service) to switches in a wireless ATM network. Given
cells and switches in an ATM network (whose locations are fixed and
known), the problem is to assign cells to switches in an optimum manner,
in an attempt to minimize a cost criterion. The cost has two components:
one is the cost of handoffs that involve two switches, and the other is
the cost of cabling. This problem is modeled as a complex integer
programming problem, and finding an optimal solution to it is
NP-complete. A stochastic search method based on a genetic approach is
proposed to solve this problem. Simulation results show that the genetic
algorithm is robust for this problem.



Keywords: wireless ATM, genetic algorithms, design of algorithm, assignment 
problem. 



1 Introduction. 

The rapid worldwide growth of digital wireless communication services motivates
a new generation of mobile switching networks to serve as infrastructure for such
services. Mobile networks being deployed in the next few years should be capable
of smooth migration to future broadband services based on high-speed wireless
access technologies, such as wireless asynchronous transfer mode (ATM) [1]. The
architecture shown in Fig. 1 was presented in [1]. In this architecture, the base
station controllers (BSCs) are omitted, and the base stations (BSs or cells) are
directly connected to the ATM switches. The mobility functions supported by
the BSCs are moved to the BSs and/or the ATM switches. In this paper, we
address a problem that is currently faced by designers of mobile communication
services and that, in the future, is likely to be faced by designers of personal
communication services (PCS).

Fig. 1. Architecture of wireless ATM PCS.

In the design process of a wireless ATM system, the telephone company first
estimates the usage of the mobile users and divides the global service area into
many coverage areas. Second, the base stations of the cellular system are
constructed and connected to the switches in the ATM network to form the
topology of the wireless ATM network. This topology may become out of date,
however, as more and more users use the PCS communication system. Some areas
that were not covered in the original design plan may now be traversed by
mobile users. The service requirements of some areas covered by the original
cells may increase, so that the capacities of the original cells are exceeded.
Therefore, the wireless ATM system must be extended so that it can provide a
higher quantity of service to the mobile users. Two methods can be used to
extend the capacity of the system. The first is to build several cells (base
stations) and add them to the system, so that the areas not covered by the
original wireless ATM network are covered. The second is to increase the
capacity of the system by reducing the size of the cells, so that the total
number of channels available per unit area is increased. In practice, this is
achieved by the process of cell splitting [6], where new base stations are
established at specific points in the cellular pattern, reducing the cell size
by a factor of 2 or more, as shown in Fig. 2.

Fig. 2. Cell splitting.

In this paper, we are given a two-level wireless ATM network as shown in
Fig. 3. In the PCS network, cells are divided into two sets. One is the set of
cells that were built originally; each cell in this set has already been
assigned to a switch in the ATM network (e.g., cells $c_1$ and $c_2$ are
assigned to switch $s_2$, and cells $c_3$, $c_4$, and $c_5$ are assigned to
switch $s_4$ in Fig. 3). The other is the set of cells that are newly added
(e.g., $c_6$, $c_7$, $c_8$) or established by cell splitting (e.g., $c_9$,
$c_{10}$, $c_{11}$, $c_{12}$, $c_{13}$, and $c_{14}$). Moreover, the locations
of cells and switches are fixed and known. To simplify the discussion, we
assume that the numbers of cells and switches are fixed. The problem is to
assign new and split cells to switches of the ATM network in an optimum
manner, in an attempt to minimize a cost criterion. The cost has two
components: one is the cost of handoffs that involve two switches, and the
other is the cost of cabling (or trunking) [3][4][5].

Consider the example shown in Fig. 4, where cells A and B are connected
to switch $s_1$, and cells C and D are connected to switch $s_2$. If the
subscriber moves from cell B to cell A, switch $s_1$ will perform a handoff for
this call. This handoff is relatively simple and does not involve any location
update in the databases that record the position of the subscriber. The handoff
also does not involve any network entity other than switch $s_1$. Now suppose
that the subscriber moves from cell B to cell C. Then the handoff involves the
execution of a fairly complicated protocol between switches $s_1$ and $s_2$. In
addition, the location of the subscriber in the databases has to be updated.
There are thus two types of handoffs: one involves only one switch and the
other involves two switches. The handoffs that occur between two cells
connected to different switches consume much more network resources (and are
therefore much more costly) than those between cells connected to the same
switch [3][4][5]. Based on this discussion, we assume that the cost of handoffs
involving only one switch is negligible. Throughout this paper, we assume each
cell to be connected to only one switch.

Fig. 3. Two-level wireless ATM network

In [5], Merchant and Sengupta considered the cell assignment problem, which
assigns cells to switches in a PCS network; they formulated the problem and
proposed a heuristic algorithm to solve it such that the total cost is
minimized. The total cost consists of the cabling cost and the location update
cost. The location update cost considered in [5], which depends only on the
frequency of handoff between two switches, is not realistic. Since the switches
of the ATM backbone are widely spread, the communication cost between two
switches should be considered in calculating the location update cost. In
[3][4], the model was extended to solve the problem of grouping cells into LAs
and assigning LAs to switches of the ATM network in an optimum manner by
considering the communication cost between two switches. A three-phase
heuristic algorithm and two genetic algorithms were proposed to solve the
assignment problem, respectively. In this paper, we follow the objective
function formulated in [3] and [4] and solve the problem of assigning new and
split cells to the switches of the ATM network such that the total cost is
minimized; this problem is defined as the extended cell assignment problem.
Fig. 4. Handoff from B to C is more expensive than from B to A



It is easy to see that finding an optimal solution to this problem is
NP-complete, and that an exact search for optimal solutions is impractical due
to exponential growth in execution time. Moreover, traditional heuristic
methods and greedy approaches tend to become trapped in local optima. Genetic
algorithms (GA) have been touted as a class of general-purpose search
strategies that strike a reasonable balance between exploration and
exploitation. Genetic algorithms proposed by Richardson et al. [7] and Davis
[2] have been constructed as robust stochastic search algorithms for various
optimization problems. GA searches by exploiting information sampled from
different regions of the solution space. The combination of crossover and
mutation helps GA escape from local optima. These properties of GA provide a
good global search methodology for the two-level wireless ATM network design
problem. In this paper, we propose a genetic algorithm for the optimal design
of the two-level wireless ATM network.

The organization of this paper is as follows. In Section 2, we formally define 
the problem. The background of Genetic Algorithms is described in Section 3. 
In Section 4, we describe our genetic algorithm for the extended cells assignment 
problem. In Section 5, we give our experimental results. Finally, a conclusion is 
given in Section 6. 



2 Problem Formulation. 

Let $CG(C, L)$ be the PCS network, where $C$ is a finite set of cells with
$|C| = n$ and $L$ is the set of edges such that $L \subseteq C \times C$. We
assume that $C^{new} \cup C^{old} = C$ and $C^{new} \cap C^{old} = \emptyset$.
Let $C^{new}$ be the set of new and split cells, where $|C^{new}| = n'$; cells
in $C^{new}$ have not yet been assigned to switches of the ATM network. Let
$C^{old}$ be the set of original cells, where $|C^{old}| = n - n'$. Without
loss of generality, we assume that cells in $C^{old}$ and $C^{new}$ are indexed
from 1 to $n - n'$ and from $n - n' + 1$ to $n$, respectively. If cells $c_i$
and $c_j$ in $C$ are assigned to different switches, then a handoff cost is
incurred. Let $f_{ij}$ be the frequency of handoff per unit time that occurs
between cells $c_i$ and $c_j$ ($i, j = 1, ..., n$), which is fixed and known.
We assume that all edges in $CG$ are undirected and weighted, and that cells
$c_i$ and $c_j$ in $C$ are connected by an edge $(c_i, c_j) \in L$ with weight
$w_{ij}$, where $w_{ij} = f_{ij} + f_{ji}$, $w_{ij} = w_{ji}$, and
$w_{ii} = 0$ [3][4]. Let $G(S, E)$ be the ATM network, where $S$ is the set of
switches with $|S| = m$, $E \subseteq S \times S$ is the set of edges,
$s_k, s_l \in S$, $(s_k, s_l) \in E$, and $G$ is connected. We assume that the
locations of cells and switches are fixed and known. The topology of the ATM
network $G(S, E)$ is also fixed and known. Let $(X_{s_k}, Y_{s_k})$ be the
coordinate of switch $s_k$, $k = 1, 2, ..., m$; let $(X_{c_i}, Y_{c_i})$ be the
coordinate of cell $c_i$, $i = 1, 2, ..., n$; and let $d_{kl}$ be the minimal
communication cost between the switches $s_k$ and $s_l$. Let $l_{ik}$ be the
cost of cabling per unit time between cell $c_i$ and switch $s_k$
($i = 1, ..., n$; $k = 1, ..., m$), and assume $l_{ik}$ is a function of the
Euclidean distance between cell $c_i$ and switch $s_k$, that is,

$$l_{ik} = \sqrt{(X_{c_i} - X_{s_k})^2 + (Y_{c_i} - Y_{s_k})^2}. \qquad (1)$$
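As a small illustration of Equation (1), the cabling cost matrix can be computed directly from the cell and switch coordinates. The following is a minimal Python sketch, assuming that the cabling cost equals the Euclidean distance itself; the function name `cabling_costs` and the sample coordinates are hypothetical and are not part of the original formulation.

```python
import math

def cabling_costs(cells, switches):
    """Return l[i][k], the Euclidean distance between cell i and switch k.

    cells and switches are lists of (x, y) coordinate pairs.
    """
    return [[math.dist(c, s) for s in switches] for c in cells]

# Hypothetical coordinates: two cells and two switches.
l = cabling_costs([(0.0, 0.0), (3.0, 0.0)], [(3.0, 4.0), (3.0, 0.0)])
```

Row i of the result holds the cabling costs of cell i to every switch, so the cheapest switch for a cell is simply the position of the minimum in its row.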

Assume the number of calls that can be handled by each cell per unit time
is equal to 1. Let $Cap_k$ be the number of remaining cells that can be
assigned to switch $s_k$. Our objective is to assign cells in $C^{new}$ to
switches so as to minimize the total cost, i.e., the sum of the cabling cost
and the handoff cost per unit time of the whole system.

To formulate this problem, let us define the following variables. Let
$x_{ik} = 1$ if cell $c_i \in C$ is assigned to switch $s_k$; $x_{ik} = 0$
otherwise. Since each cell should be assigned to only one switch, we have the
constraint $\sum_{k=1}^{m} x_{ik} = 1$, for $i = 1, ..., n$. Further, the
constraint on the call handling capacity is

$$\sum_{i=n-n'+1}^{n} x_{ik} \le Cap_k, \quad k = 1, ..., m. \qquad (2)$$

Also, the sum of cabling costs is

$$\sum_{i=1}^{n} \sum_{k=1}^{m} l_{ik} x_{ik}. \qquad (3)$$

To formulate the handoff cost, variables $z_{ijk} = x_{ik} x_{jk}$, for
$i, j = 1, ..., n$ and $k = 1, ..., m$, are defined in [5]. Thus, $z_{ijk}$
equals 1 if both cells $c_i$ and $c_j$ are connected to a common switch $s_k$;
otherwise it is zero. Further, let

$$y_{ij} = \sum_{k=1}^{m} z_{ijk}, \quad i, j = 1, ..., n. \qquad (4)$$

Thus, $y_{ij}$ takes a value of 1 if both cells $c_i$ and $c_j$ are connected
to a common switch and 0 otherwise. With this definition, it is easy to see
that the cost of handoffs per unit time is given by [3][4]

$$\sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{m} \sum_{l=1}^{m} w_{ij} (1 - y_{ij}) x_{ik} x_{jl} d_{kl}. \qquad (5)$$

This, together with our earlier statement about the sum of cabling costs,
gives us the objective function [3][4]:

$$\text{minimize:} \quad \sum_{i=1}^{n} \sum_{k=1}^{m} l_{ik} x_{ik} + \alpha \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{m} \sum_{l=1}^{m} w_{ij} (1 - y_{ij}) x_{ik} x_{jl} d_{kl} \qquad (6)$$

where $\alpha$ is the ratio of the cost between cabling and handoff costs.

The following assumptions will be satisfied:

(1) We assume that the number of cells in $C^{new}$ is less than or equal to
$\sum_{k=1}^{m} Cap_k$. That is, there is no need for adding new switches into
the ATM network.

(2) The structures and locations of the ATM network and the PCS network are
fixed and known.

(3) Each cell in the PCS network will be directly assigned and connected to
only one switch in the ATM network.

(4) To simplify the discussion, we assume that $Cap_k > 0$, for
$k = 1, ..., m$.

Example 1. Consider the graph shown in Fig. 3. There are 14 cells in $CG$,
and they should be assigned to 4 switches in $S$. In $CG$, cells are divided
into two sets: one is the set $C^{old}$ of cells that were built originally and
have been assigned to a switch in the ATM network (e.g., {$c_1$, $c_2$, $c_3$,
$c_5$} in Fig. 3); the other is the set $C^{new}$ of cells that are new cells
(e.g., {$c_6$, $c_7$, $c_8$}) or split cells (e.g., {$c_9$, $c_{10}$, $c_{11}$,
$c_{12}$, $c_{13}$, $c_{14}$}). The weight of the edge between two cells is the frequency
of handoffs per unit time that occur between them. Four switches are positioned 
at the center of the cell: ci, C 2 , C 4 , and cg. 

3 Background of Genetic Algorithms. 

The Genetic Algorithm (GA) was developed by John Holland at the University
of Michigan [7]. Genetic algorithms are search techniques for global
optimization in a complex search space. As the name suggests, GA employs the
concepts of natural selection and genetics. Using past information, GA directs
the search toward expected improved performance. The concept of GA is based on
the theory of adaptation in natural and artificial systems [7]. In artificial
adaptive systems, adaptation starts with an initial set of structures (possible
solutions). These initial structures are modified according to the performance
of their solution by using an adaptive plan to improve the performance of these
structures. It has been proved by Holland that repeatedly applying this
adaptive plan to input structures results in optimal or near optimal solutions
[7]. The traditional methods of optimization and search do not fare well over a
broad spectrum of problem domains [2]. Some are limited in scope because they
employ local search techniques (e.g., calculus-based methods). Others, such as
enumerative schemes, are not efficient when the practical search space is too
large.
3.1 Concept of GA

The search space in GA is composed of possible solutions to the problem. A
solution in the search space is represented by a sequence of 0s and 1s. This
solution string is referred to as a chromosome in the search space. Each
chromosome has an associated objective function value called the fitness. A
good chromosome is one that has a high or low fitness value, depending upon the
nature of the problem (maximization or minimization). The strength of a
chromosome is represented by its fitness value. Fitness values indicate which
chromosomes are to be carried to the next generation. A set of chromosomes and
associated fitness values is called the population. This population at a given
stage of GA is referred to as a generation. The general GA proceeds as follows:

Genetic Algorithm()
Begin
    Initialize population;
    while (not terminal condition) do
    Begin
        choose parents from population;             /* Selection */
        construct offspring by combining parents;   /* Crossover */
        optimize (offspring);                       /* Mutation */
        if suited (offspring) then
            replace worst fit (population) with better offspring;
                                                    /* Survival of the fittest */
    End;
End.

There are three main processes in the while loop for GA: 

(1) The process of selecting good strings from the current generation to be 
carried to the next generation. This process is called selection/reproduction. 

(2) The process of shuffling two randomly selected strings to generate new 
offspring is called crossover. Sometimes, one or more bits of a chromosome are 
complemented to generate a new offspring. This process of complementation is 
called mutation. 

(3) The process of replacing the worst performing chromosomes based on the 
fitness value. 

The population size is finite in each generation of GA, which implies that
only relatively fit chromosomes in generation (i) will be carried to the next
generation (i + 1). The power of GA comes from the fact that the algorithm
converges rapidly to an optimal or near optimal solution. The iterative process
terminates when the solution reaches the optimum value. The three genetic
operators, namely, selection, crossover and mutation, are discussed in the
following subsections.

3.2 Selection / Reproduction 

Since the population size in each generation is limited, only a finite number
of good chromosomes will be copied into the mating pool, depending on the
fitness value. Chromosomes with higher fitness values contribute more copies to
the mating pool than do those with lower fitness values. This can be achieved
by assigning a proportionately higher probability of copying to a chromosome
that has a higher fitness value [2]. Selection/reproduction uses the fitness
values of the chromosomes obtained after evaluating the objective function. It
uses a biased roulette wheel [2] to select the chromosomes that are to be taken
into the mating pool. It ensures that highly fit chromosomes (with high fitness
values) will have a higher number of offspring in the mating pool. Each
chromosome (i) in the current generation is allotted a roulette wheel slot
sized in proportion ($p_i$) to its fitness value. This proportion $p_i$ can be
defined as follows. Let $Of_i$ be the actual fitness value of a chromosome (i)
in generation (j) of $g$ chromosomes, let $Sum_j = \sum_{i=1}^{g} Of_i$ be the
sum of the fitness values of all the chromosomes in generation j, and let
$p_i = Of_i / Sum_j$.

When the roulette wheel is spun, there is a greater chance that a better 
chromosome will be copied into the mating pool because a good chromosome 
occupies a larger area on the roulette wheel. 
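The biased roulette wheel described above can be sketched in a few lines of Python. This is only an illustration, assuming non-negative fitness values where larger means better; for a minimization problem the cost would first have to be converted into such a fitness, which is not shown. The names `selection_probabilities` and `roulette_select` are hypothetical.

```python
import random

def selection_probabilities(fitness):
    """Return p_i = Of_i / Sum_j for every chromosome of one generation."""
    total = sum(fitness)
    if total <= 0:
        raise ValueError("the fitness values must have a positive sum")
    return [f / total for f in fitness]

def roulette_select(fitness, rng=random):
    """Spin the wheel once and return the index of the chosen chromosome."""
    total = sum(fitness)
    if total <= 0:
        raise ValueError("the fitness values must have a positive sum")
    spin = rng.random() * total      # a point on the wheel, in [0, total)
    running = 0.0
    for i, f in enumerate(fitness):
        running += f                 # right edge of the slot of chromosome i
        if spin < running:
            return i
    return len(fitness) - 1          # guard against floating-point rounding
```

Calling `roulette_select` once per free place in the mating pool reproduces the behavior described in the text: a chromosome with a larger slot is copied more often on average.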

3.3 Crossover 

This phase involves two steps: first, from the mating pool, two chromosomes are
selected at random for mating, and second, a crossover site c is selected
uniformly at random in the interval [1, n]. Two new chromosomes, called
offspring, are then obtained by swapping all the characters between positions
c + 1 and n. This can be shown using two chromosomes, say P and Q, each of
length n = 6 bit positions:

chromosome P: 111|000;
chromosome Q: 000|111.

Let the crossover site be 3. The two substrings between 4 and 6 are swapped,
and the two substrings between 1 and 3 remain unchanged; then, the two
offspring can be obtained as follows:

chromosome R: 111|111;
chromosome S: 000|000.
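The crossover step can be written as a short Python function. This is a minimal sketch of the generic operator only; the function name `single_point_crossover` is hypothetical, and chromosomes are represented as strings for readability.

```python
def single_point_crossover(p, q, c):
    """Swap the characters at positions c + 1 .. n of two chromosomes.

    p and q are equal-length strings; c is the crossover site, 1 <= c <= n.
    """
    if len(p) != len(q):
        raise ValueError("the parents must have the same length")
    if not 1 <= c <= len(p):
        raise ValueError("the crossover site must lie in [1, n]")
    # Positions 1..c are kept, positions c+1..n come from the other parent.
    return p[:c] + q[c:], q[:c] + p[c:]

r, s = single_point_crossover("111000", "000111", 3)
```

With the chromosomes P and Q from the text and crossover site 3, `r` and `s` are the offspring R and S. Choosing c = n leaves both parents unchanged.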

3.4 Mutation 

Combining the reproduction and crossover operations may sometimes result in 
losing potentially useful information in the chromosome. To overcome this prob- 
lem, mutation is introduced. It is implemented by complementing a bit (0 to 
1 and vice versa) at random. This ensures that good chromosomes will not be 
permanently lost. 
Fig. 5. (a) Cell-oriented representation of chromosome structure, (b)
Cell-oriented representation of Example 1.



4 Genetic Algorithm for Cells Extended Assignment 
Problem. 

In this section, we discuss the details of GA developed to solve the problem of 
optimum assignment of cells in PCSs to switches in the ATM network. The de- 
velopment of GA requires: (1) a chromosomal coding scheme, (2) a chromosome 
adjustment procedure, (3) a genetic crossover operator, (4) mutation operators, 
(5) a fitness function definition, (6) a replacement strategy, and (7) termination 
rules. 



4.1 Chromosomal coding 

Since our problem involves representing connections between cells and switches,
we employ a coding scheme that uses positive integers. Cells are labeled from
one to n (the total number of cells), and switches are labeled from one to m
(the total number of switches). The cell-oriented representation of the
chromosome structure is shown in Fig. 5(a), where the ith cell belongs to the
$v_i$-th switch. Without loss of generality, we assume that cells in $C^{old}$
are indexed from 1 to $n - n'$ and cells in $C^{new}$ are indexed from
$n - n' + 1$ to $n$. For example, the chromosome of the example shown in Fig. 3
is shown in Fig. 5(b). It is worth noting that the cell-oriented representation
of the chromosome structure can be divided into two sets; the first set, which
represents the assignment of cells in $C^{old}$, is fixed while the GA runs.
This first set could be ignored since it is unchanged during experiments. For
ease of understanding, however, the fixed set of cells is still kept in the
chromosome in the rest of this paper.

4.2 Chromosome Adjustment Procedure 

Since the initial population of our solution method is randomly generated, and
the operators of GA sometimes generate a chromosome that does not represent a
feasible assignment, such chromosomes are repaired by means of the chromosome
adjustment procedure described below. Let $n_k$ be the number of cells assigned
to switch $s_k$; three types of switches are defined:

(1) over-switch: if $n_k > Cap_k$;

(2) saturated-switch: if $n_k = Cap_k$;

(3) poor-switch: if $n_k < Cap_k$.

Switches are grouped into sets $S_{over}$, $S_{sat}$, and $S_{poor}$ for
over-switches, saturated-switches and poor-switches, respectively. To change
infeasible chromosomes into feasible ones, the chromosome adjustment procedure
is repeatedly used to reassign cells from over-switches to poor-switches until
all over-switches become saturated-switches.

We have the following algorithm:

Algorithm: Chromosome Adjustment Procedure.

Step 1: Switches are grouped into sets $S_{over}$, $S_{sat}$, and $S_{poor}$
        according to the number of cells being assigned to them; without loss
        of generality, switches are renumbered such that $n_k \ge n_{k+1}$,
        $k = 1, ..., m - 1$.

Step 2: Construct a set SP (switch pool) of switch numbers by putting
        $Cap_k - n_k$ copies of "k" into SP, if $n_k < Cap_k$, for
        $k = 1, 2, ..., m$.

Step 3: Randomly generate a number as the adjustment point $z$ in
        $[n - n' + 1, n]$; while $S_{over}$ is nonempty do Step 4.

Step 4: If $l = v_z \in S_{over}$, then randomly select and remove a number
        (say $k$) from SP; reassign cell $c_z$ to switch $s_k$, i.e., set the
        value of $v_z$ to $k$; decrease $n_l$ by 1; if $n_l = Cap_l$ then move
        switch $s_l$ from $S_{over}$ to $S_{sat}$. Otherwise, increase $z$ by
        1; if $z > n$ then $z = n - n' + 1$.
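The four steps above can be sketched in Python as follows. This is an illustrative reading of the procedure, not the authors' implementation: the name `adjust_chromosome`, the list-based data layout and the two feasibility guards are assumptions, and the renumbering of switches in Step 1 is omitted because the sets are recomputed from the counts.

```python
import random

def adjust_chromosome(v, cap, n_old, rng=random):
    """Repair a chromosome so that no switch holds more cells than it can.

    v     -- switch labels 1..m, one per cell; the first n_old entries
             belong to the original cells and are never changed
    cap   -- cap[k - 1] is the capacity Cap_k of switch k
    """
    v = list(v)
    n, m = len(v), len(cap)
    count = [0] * (m + 1)                  # count[k] plays the role of n_k
    for s in v:
        count[s] += 1
    # Step 1: find the over-switches.
    over = {k for k in range(1, m + 1) if count[k] > cap[k - 1]}
    if not over:
        return v
    # Step 2: the pool holds Cap_k - n_k copies of every poor-switch k.
    pool = [k for k in range(1, m + 1)
            for _ in range(max(0, cap[k - 1] - count[k]))]
    # Guards (assumptions, not in the text): only new cells can be moved.
    for k in over:
        if count[k] - v[n_old:].count(k) > cap[k - 1]:
            raise ValueError("original cells alone exceed a capacity")
    if sum(count[k] - cap[k - 1] for k in over) > len(pool):
        raise ValueError("the total capacity is too small")
    # Step 3: random adjustment point among the new cells (0-based index).
    z = rng.randrange(n_old, n)
    # Step 4: move cells away from over-switches until none is left.
    while over:
        l = v[z]
        if l in over:
            k = pool.pop(rng.randrange(len(pool)))
            v[z] = k
            count[l] -= 1
            count[k] += 1
            if count[l] == cap[l - 1]:
                over.discard(l)            # switch l is now saturated
        else:
            z = z + 1 if z + 1 < n else n_old   # wrap around to cell n - n' + 1
    return v
```

For instance, `adjust_chromosome([2, 2, 4, 4, 4, 1, 1, 2, 1, 3, 1, 1, 1, 4], [4, 4, 4, 4], 5)` moves two of the new cells away from switch 1; which two cells are moved, and to which poor-switches, depends on the random draws.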



Example: For the example shown in Fig. 3, we assume $Cap_k = 4$, for
$k = 1, 2, ..., m$. The result of applying the chromosome adjustment procedure
to this example is as follows.

Before adjustment, the chromosome is generated as:

22444 || 112131114

After running Steps 1 and 2, we have $S_{over} = \{1\}$, $S_{sat} = \{4\}$, and
$S_{poor} = \{2, 3\}$. Since $n_3 = 1 < Cap_3$ and $n_2 = 3 < Cap_2$, three "3"
and one "2" are put in SP; thus, SP = {2, 3, 3, 3}. Moreover,

switch number                       1  2  3  4
# of cells being assigned ($n_k$)   6  3  1  4

When running Step 4, assume z = 11 is randomly generated. Since
$v_z \in S_{over}$, cell $c_{11}$ is assigned to switch $s_3$, which is
randomly selected from SP. We have

22444 || 112133114

and

switch number                       1  2  3  4
# of cells being assigned ($n_k$)   5  3  2  4

By repeatedly executing Step 4, assume 3 is randomly selected from SP for
cell $c_{12}$. Finally, the chromosome is changed to

22444 || 112133314

switch number                       1  2  3  4
# of cells being assigned ($n_k$)   4  3  3  4

4.3 Genetic crossover operator 

Two types of genetic operators were used to develop this algorithm:

(1) single point crossover,

(2) cell-exchanging operator.

The single point crossover randomly selects two chromosomes (say $P_1$ and
$P_2$) for crossover from the previous generation; then, using a random number
generator, an integer value $i$ is generated in the range $[n - n', n - 1]$.
This number is used as the crossover site. To create new offspring, the single
point crossover consists of two stages: first, all characters between $i$ and
$n$ of the two parents are swapped, and temporal chromosomes $C_1$ and $C_2$
are generated. Then, the chromosome adjustment procedure is applied to the
temporal chromosomes to turn them into feasible chromosomes.

The following example provides a detailed description of the single point
crossover operation (assume crossover site $i = 9$):

parent $P_1$: 22444 || 112 | 133314,
parent $P_2$: 22444 || 333 | 141121.

First, the two substrings between 9 and 14 are swapped, and we have:

temporal chromosome $C_1$: 22444 || 112 | 141121,
temporal chromosome $C_2$: 22444 || 333 | 133314.

Then, the Chromosome Adjustment Procedure is applied to change temporal
chromosomes $C_1$ and $C_2$ to $O_1$ and $O_2$ as follows (assume adjustment
points are 10 and 13 for $C_1$ and $C_2$, respectively):

offspring $O_1$: 22444 || 112 | 143321,
offspring $O_2$: 22444 || 123 | 133314 (assume "1" and "2" are selected from
SP for $c_6$ and $c_7$, respectively).
In the cell-exchanging operator, two cells in $C^{new}$ of a chromosome are
randomly selected, and the assigned switches of the two cells are exchanged.

For example, before exchanging, we have:

22444 || 112133324.

(Assume that the two cells $c_8$ and $c_{11}$ are selected.) After exchanging:

22444 || 113132324.
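The cell-exchanging operator is simple enough to state directly in code. The following Python sketch is illustrative only; the function name `exchange_cells` and the list representation of the chromosome are assumptions.

```python
def exchange_cells(v, a, b, n_old):
    """Swap the switches assigned to cells a and b (1-based cell numbers).

    Both cells must be new or split cells, i.e. lie after the first n_old
    positions of the chromosome v.
    """
    if not (n_old < a <= len(v) and n_old < b <= len(v)):
        raise ValueError("both cells must be new or split cells")
    out = list(v)
    out[a - 1], out[b - 1] = out[b - 1], out[a - 1]
    return out

before = [int(ch) for ch in "22444112133324"]
after = exchange_cells(before, 8, 11, n_old=5)
```

Because the two switch labels are only swapped, the number of cells per switch does not change, so a feasible chromosome stays feasible and no adjustment is needed afterwards.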



4.4 Mutations and Heuristic Mutations 

Six types of mutations are used in this algorithm, and each is applied with
the same probability.

(1) Traditional Mutation (TM): randomly select a cell $v_i$ of the vector,
where $i$ is in $[n - n' + 1, n]$, and change it to a random number between 1
and $m$. After the transformation, the chromosome may become an infeasible
one; thus, the Chromosome Adjustment Procedure must be applied to the
chromosome.

(2) Multiple Cells Mutation (MCM): randomly select two numbers $k$, $l$ between
1 and $m$, and change the value of the cells whose value is $k$ to $l$ and of
those whose value is $l$ to $k$. The following example provides a detailed
description of the multiple cells mutation (assume random numbers $k = 3$ and
$l = 2$). Before mutating, we have

22444 || 113132314.

After mutating, we have

22444 || 112123214.

After the transformation, the chromosome may become an infeasible one; thus,
the Chromosome Adjustment Procedure must be applied to the chromosome. Assuming
the adjustment point is 11, we have

22444 || 112123314.

(3) Heaviest Weight First Preference (HWFP) [3]: Since the handoff cost
involving only one switch is negligible, two cells can be assigned to the same
switch so as to reduce the handoff cost between these cells. Two cells with a
higher weight $w_{ij}$ should have a higher probability of being assigned to
the same switch. Thus, if we consider two connected cells $c_i$ and
$c_j \in C$, then the probability of mutation from $v_i$ of cell $c_i$ to the
value $v_j$ of cell $c_j$ is as follows:

p. ^ 

(tj) \-^degree(ci) ’ 

Ei=iJ2j=i Wij 

where $degree(c_i)$ is the number of cells connected to cell $c_i$ in $CG$.

(4) Minimal Cabling Cost First Preference (MCCFP) [3]: To reduce the
cabling costs between cells and switches, we prefer to assign each cell to the
nearer switch rather than the farther one. A cell $c_i$ and a switch $s_k$ with
a lower cabling cost $l_{ik}$ should result in a higher probability that $c_i$
will be assigned to $s_k$. Thus, if we consider the randomly selected cell
$c_i$, then the probability of mutation from $v_i$ of cell $c_i$ to the value
$k$ is:

Lmax lik 

^(i,k) = _ 7 \ > 

where Lmax = 

(5) Minimal Fixed Handoff Cost First Preference (MFHCFP): Given $m$
sets of cells $P = \{P_l\}$, $l = 1, 2, ..., m$, we assume
$P_1 \cup P_2 \cup ... \cup P_m = C^{old}$ and $P_i \cap P_j = \emptyset$,
where $i \ne j$. For example, in Fig. 3, $P_2 = \{c_1, c_2\}$,
$P_4 = \{c_3, c_4, c_5\}$, and $P_3 = P_1 = \emptyset$. Without loss of
generality, we assume that the cells in set $P_j$ are assigned to switch $s_j$,
$j = 1, ..., m$. Let $sid(c_i) = l$ if $c_i$ is in $P_l$, where $l$ is called
the sid of cell $c_i$. Let $LUCS(i, l) = \sum_{\forall c_j \in P_l} w_{ij}$.

For example, since $c_1 \in P_2$ and $c_2 \in P_2$,
$sid(c_1) = sid(c_2) = 2$, $LUCS(2, 2) = \sum_{\forall c_j \in P_2} w_{2j} = 4$,
and $LUCS(2, 4) = \sum_{\forall c_j \in P_4} w_{2j} = w_{23} + w_{24} + w_{25}
= 3 + 8 + 0 = 11$. p)(2, 4) = LUCS(2, 4) − LUCS(2, 2) = 11 − 4 = 7.

To evaluate the effect of cell $c_i$ in $C^{new}$ being assigned to switch
$s_k$, we must compute the cabling cost and the location update cost derived
from this event. By the definition described above, the cabling cost is
$l_{ik}$. The location update cost has two components: one is the location
update cost between a cell in $C^{new}$ and a cell in $C^{old}$; the other is
the location update cost of two cells in $C^{new}$. Since the cells in
$C^{old}$ have been assigned to switches in the ATM network, if $c_i$ in
$C^{new}$ is assigned to $s_k$, the location update cost between cell $c_i$ in
$C^{new}$ and cell $c_j$ in $C^{old}$ is fixed and can be computed by

$$A_{ik} = \alpha \sum_{\forall s_l \in S,\ l \ne k} \left( LUCS(i, l) \cdot d_{kl} \right), \quad i = n - n' + 1, n - n' + 2, ..., n; \ k = 1, 2, ..., m.$$

For the example shown in Fig. 3, if $c_6$ is assigned to $s_3$ and
$\alpha = 1$, then $A_{63} = LUCS(6, 1) \times d_{31} + LUCS(6, 2) \times
d_{32} + LUCS(6, 4) \times d_{34} = 0 \times 30 + 0 \times 20 + (2 + 4) \times
30 = 180$.

Thus, for a cell $c_i \in C^{new}$, the probability of mutation from
$sid(c_i)$ to $k$ is

$$M_{(i,k)} = \frac{A_{max} - A_{ik}}{\sum_{l=1}^{m} (A_{max} - A_{il})},$$

where $A_{max} = \max_{l=1}^{m} \{A_{il}\}$.

(6) Minimal Estimated Handoff Cost First Preference (MEHCFP): Since
cells in $C^{new}$ have not yet been assigned to switches, the location update cost between
two cells in $C^{new}$ cannot be computed by a deterministic formula. To estimate
the location update cost between $c_i \in C^{new}$ and the other cells in $C^{new}$, let
$avgDIST_k = \sum_{l=1}^{m} d_{kl}/(m - 1)$ be the average distance between switch $s_k$ and
the other switches, and let $avgLU_i = \sum_{\forall c_j \in C^{new}} w_{ij}/n'$ be the average location update cost
between $c_i$ and the other cells $c_j$ in $C^{new}$. It is worth noting that if two cells are
assigned to the same switch, then the handoff cost between them is ignored.
If cell $c_i$ in $C^{new}$ is assigned to switch $s_k$ and the capacity of switch $s_k$ is
$Cap_k$, then, once all cells are assigned to switches, at most $n' - Cap_k$ cells
can incur a location update cost with cell $c_i$, which is assigned to $s_k$.
Let $NL_i = \sum_{\forall c_j \in C^{new},\ w_{ij} \neq 0} 1$ be the number of cells $c_j$ in $C^{new}$ for which the
frequency of handoff between $c_i$ and $c_j$ is not zero. The total location update
cost between $c_i$, which is assigned to $s_k$, and the cells assigned to other switches can
be estimated by

$$B_{ik} = \begin{cases} \alpha \times (n' - Cap_k) \times avgLU_i \times avgDIST_k, & \text{if } Cap_k < NL_i \\ 0, & \text{otherwise} \end{cases}$$

If $Cap_k > NL_i$, then the $NL_i$ cells can be assigned to the same switch $s_k$; that is, $B_{ik}$
is set to 0.

For example, if $c_6 \in C^{new}$ is assigned to switch $s_3$ in Fig. 3, then $avgDIST_3 =
23.33$ and $avgLU_6 = (w_{67} + w_{68})/9 = 0.556$. Assuming $Cap_3 = 2$, at most 7 cells in
$C^{new}$ incur a location update cost with cell $c_6$. Thus, $B_{63} = \alpha \times
7 \times 0.556 \times 23.33 = 90.728$.

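Thus, for a cell $c_i \in C^{new}$, the probability of mutation from $v_i$ to $k$ is

$$p(i, k) = \frac{B_{max} - B_{ik}}{\sum_{l=1}^{m} (B_{max} - B_{il})},$$

where $B_{max} = \max_{l=1}^{m} \{B_{il}\}$.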



4.5 Fitness function definition 

Generally, genetic algorithms use fitness functions to map objectives to costs to
achieve the goal of an optimally designed two-level wireless ATM network. If cell
$c_i$ is assigned to switch $s_k$, then $v_i$ in the chromosome is set to $k$. Let $d_{kl}$ be
the minimal communication cost between switches $s_k$ and $s_l$ in $G$. An objective
function value is associated with each chromosome, which is the same as the
fitness measure. We use the following objective function:

$$\text{minimize} \quad \sum_{i=1}^{n} l_{i v_i} + \alpha \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}\, d_{v_i v_j}$$
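
As an illustration, the objective can be evaluated for one chromosome as sketched below. The sketch assumes the objective is the total cabling cost plus $\alpha$ times the weighted inter-switch cost summed over all ordered cell pairs, uses 0-based indices, and stores the costs in nested lists; the function and parameter names are hypothetical.

```python
def total_cost(assignment, cabling, weights, dist, alpha=1.0):
    """Cabling cost plus alpha times the location update cost of a chromosome.

    assignment[i] is the switch of cell i, cabling[i][k] the cabling cost of
    cell i to switch k, weights[i][j] the handoff weight between cells i and j,
    and dist[k][l] the communication cost between switches k and l.
    """
    n = len(assignment)
    cable = sum(cabling[i][assignment[i]] for i in range(n))
    update = sum(weights[i][j] * dist[assignment[i]][assignment[j]]
                 for i in range(n) for j in range(n))
    return cable + alpha * update
```

Cells assigned to the same switch contribute nothing to the second term as long as `dist[k][k]` is zero.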



4.6 Replacement strategy 

This subsection discusses the method used to create a new generation after crossover
and mutation are carried out on the chromosomes of the previous generation.
Several replacement strategies have been proposed in the literature, and a good
discussion can be found in [2]. The most common strategies probabilistically replace
the poorest performing chromosomes in the previous generation. The elitist
strategy appends the best performing chromosome of a previous generation to
the current population and thereby ensures that the chromosome with the best
objective function value always survives to the next generation. The algorithm
developed here combines both of the concepts mentioned above. Each offspring
generated after crossover is added to the new generation if it has a better objective
function value than both of its parents. If the objective function value
of an offspring is better than that of only one of the parents, then we select a
chromosome randomly from the better parent and the offspring. If the offspring
is worse than both parents, then either of the parents is selected at random for the
next generation. This ensures that the best chromosome is carried to the next
generation, while the worst is not carried to the succeeding generations.
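
The three cases of this replacement rule can be sketched in Python as follows. The sketch assumes that a lower objective value is better and treats ties as "not better"; the function name and the `cost` callback are hypothetical.

```python
import random


def choose_survivor(parent_a, parent_b, child, cost, rng=random):
    """Pick the chromosome that enters the next generation (lower cost wins)."""
    better, worse = sorted((parent_a, parent_b), key=cost)
    c = cost(child)
    if c < cost(better):
        # Offspring beats both parents: it is always kept.
        return child
    if c < cost(worse):
        # Offspring beats only one parent: random pick between the better
        # parent and the offspring.
        return rng.choice([better, child])
    # Offspring is no better than either parent: random pick of a parent.
    return rng.choice([parent_a, parent_b])
```

Passing a seeded `random.Random` instance as `rng` makes runs reproducible.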

4.7 Termination rules 

Execution of the GA can be terminated using any one of the following rules:

R1: when the average and maximum fitness values exceed a predetermined
threshold;

R2: when the average and maximum fitness values of strings in a generation
become the same; or

R3: when the number of generations exceeds an upper bound specified by the
user.

The best value for a given problem can be obtained from a GA when the
algorithm is terminated using R2 [2].
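
The three rules can be checked with a short Python function such as the sketch below. It assumes fitness values where higher is better, and it tests R2 by comparing the minimum and maximum fitness, which is equivalent to the average equalling the maximum but avoids floating-point rounding in the mean. The function and parameter names are hypothetical.

```python
def termination_rule(fitness, generation, threshold, max_generations):
    """Return the name of the first rule that fires, or None to continue."""
    avg = sum(fitness) / len(fitness)
    best = max(fitness)
    if avg > threshold and best > threshold:
        return "R1"
    if min(fitness) == best:
        # Average equals maximum exactly when all fitness values are equal.
        return "R2"
    if generation > max_generations:
        return "R3"
    return None
```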

5 Experimental Results. 

In order to evaluate the performance of the algorithm, we implemented it and
applied it to randomly generated problems. The results of these
experiments are reported below. In all the experiments, the algorithm was
implemented in C and run on a Windows NT machine
with a Pentium II 450 MHz CPU and 256 MB of RAM. We simulated a hexagonal
system in which the cells were configured as an H-mesh. The handoff frequency
$f_{ij}$ for each border was generated from a normal random number with mean 100
and variance 20. The performance of the algorithm was evaluated based on the
speed with which the fitness of its solution improves and on its complexity. The
time complexity of our genetic algorithm is $O((n^2 + m^2)\, n_p\, g)$, where $g$ is the
maximum number of generations and $n_p$ is the population size.
To examine the effect of the mutation probability, we
set $n = 500$, $n' = 3n/4$, $m = 50$, $Cap = 50$, $\alpha = 1$, population size (popsize)
$= 50$, crossover probability ($P_c$) $= 1.0$, and maximum number of generations $= 1000$;
the mutation probability is selected from {0.001, 0.005, 0.01, 0.02, 0.05, 0.1,
0.2, 0.5}. The experimental results are shown in Fig. 6. We found that when
the mutation probability is small (0.001) or large (0.5), the total cost starts to
improve later and may get trapped in a local minimum. In general,
a moderate value of the mutation probability leads to the best performance.

To examine the effect of the population size, we set $n = 500$, $n' = 3n/4$,
$m = 50$, $Cap = 50$, $\alpha = 5$, mutation probability ($P_m$) $= 0.05$, and maximum number
of generations $= 1000$; the population size was varied from 20 to 100 in steps of 20.

Fig. 6. The effect of the mutation probability with n = 500.



We find that the population size should be kept high (80 or 100) in order to get
good performance from the algorithm.

To examine the performance of the algorithm, we also implemented a simple two-phase
heuristic algorithm named NSF (Nearest Switch First) [4].
In the first phase of NSF, each cell is assigned to the switch that is nearest.
If the nearest switch is full, then the next nearest switch is tried. In the second
phase, the two cells assigned to different switches whose exchange yields the greatest
reduction in total cost are selected and exchanged; this step is repeated
until the total cost cannot be reduced further. Figures 6 and 7
also show the results of the comparison for $n = 500$. As the results of the
GA in Fig. 6 and Fig. 7 show, the total cost decreases rapidly in the first 200 generations.
After 1000 generations, the improvement is nearly 34.6%.
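
The first phase of NSF can be sketched as below. This is only an illustration under stated assumptions: cells are processed in index order, switch capacity is counted in cells, and the second (pairwise exchange) phase is omitted. The function name and data layout are hypothetical and are not taken from [4].

```python
def nsf_phase_one(distance, capacity):
    """Assign each cell to its nearest switch that still has free capacity.

    distance[i][k] is the distance from cell i to switch k and capacity[k]
    is the number of cells switch k can hold.
    """
    remaining = list(capacity)
    assignment = []
    for dists in distance:
        # Try the switches from nearest to farthest.
        for k in sorted(range(len(dists)), key=dists.__getitem__):
            if remaining[k] > 0:
                remaining[k] -= 1
                assignment.append(k)
                break
        else:
            raise ValueError("not enough total switch capacity")
    return assignment
```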

Fig. 7. The effect of the population size with n = 500.



6 Conclusions. 

In this paper, we investigated the extended cell assignment problem, which is to
optimally assign newly added and split cells in a PCS (Personal Communication
Service) network to switches in a wireless ATM network. This problem is currently faced
by designers of mobile communication services and, in the future, is likely to
be faced by designers of PCS, since more and
more users may use PCS communication systems. Mobile users may travel through
areas that were not covered in the original design plan. The service requirements
of some areas covered by the original cells may
also increase, so that the capacities of the original cells are exceeded. Therefore,
the wireless ATM system must be extended so that it can provide
a higher quantity of services to the mobile users. Two methods can be used to extend
the capacity of the system. In the first,
several cells or base stations (BSs) are built and added to the system so
that the uncovered areas of the original wireless ATM network are covered.
In the second, the cellular radio extension process, the capacity of the system
is increased by reducing the size of the cells so that the total number of
channels available per unit area is increased. In practice, this is achieved by the
process of cell splitting [6].

Since finding an optimal solution of the extended cell assignment problem is NP-complete,
a stochastic search method based on a genetic approach is proposed to solve it.
Simulation results show that the genetic algorithm is robust for this problem. In
our method, a cell-oriented representation is used to represent the cell assignment,
and three general genetic operators (selection, crossover, and mutation) are
employed. A chromosome adjustment method is proposed to adjust a chromosome
so that it represents a feasible solution and to find the fitness of the chromosome. Two types
of operator (single-point crossover and cell-exchanging) and six types of mutation
(TM, MCM, HWFP, MCCFP, MFHCFP, and MEHCFP) are employed
in our method. Experimental results indicate that the algorithm runs efficiently.
The total cost of the GA decreases rapidly in the first 200 generations, and the GA
performs better than the NSF algorithm.

Abstract. The ambient calculus of Cardelli and Gordon is a process
calculus for describing mobile computation where processes may reside
within a hierarchy of locations, called ambients. The dynamic semantics
of this calculus is presented in a chemical style that allows for a compact
and simple formulation. In this semantics, an equivalence relation,
called spatial congruence, is defined on top of an unlabelled transition
system.

We show that it is decidable to check whether two ambient calculus pro- 
cesses are spatially congruent or not. This result is based on a natural 
and intuitive interpretation of ambient processes as edge-labelled un- 
ordered trees, which allows us to concentrate on the subtle interaction 
between two key operators of the ambient calculus, namely restriction, 
that accounts for the dynamic generation of new location names, and 
replication, used to encode recursion. The result of our study is the defi- 
nition of an algorithm to decide spatial congruence and a definition of a 
normal form for processes that is useful in the proof of important equiv- 
alence laws. 



1 Introduction 

Algebraic frameworks, of which process algebras are one of the most prominent 
examples, have proved to be a valuable mathematical tool to reason about the 
behaviour of distributed and communicating systems. Recently, Cardelli and 
Gordon have proposed a new process algebra, the ambient calculus [3], for de- 
scribing systems with mobile computations. 

In the ambient calculus, processes may reside within a hierarchy of locations, 
called ambients. Each location is a cluster of processes and sub-ambients that 
can move as a group. 

Ambients provide an interesting abstraction that combines, within the same
theoretical framework, three essential notions: mobile computation, site and mobility.
Mobile computations are computations that can dynamically change the
place where they are executed and are continuously active before and after movement,
as is the case with agents. Sites are the locations where these computations
happen, like processors or routers. Finally, mobility represents a modification
of the site topology, as occurs, for instance, with mobile or temporarily
disconnected computers, and in the crossing of administrative boundaries, like
applets crossing a firewall.



J. He and M. Sato (Eds.): ASIAN 2000, LNCS 1961, pp. 88-103, 2000.
(c) Springer-Verlag Berlin Heidelberg 2000
Inspired by Berry’s and Boudol’s Chemical Abstract Machine model [2] and
Milner’s “chemical” presentation of the π-calculus [13], the dynamic semantics
of the ambient calculus is based on a spatial congruence relation, denoted $\equiv$, on
which the reduction system is based. Spatial congruence identifies processes up
to elementary spatial rearrangements and allows a simple and compact presentation
of the reduction rules in which the sub-processes having to interact - the
redexes in λ-calculus terminology - appear in contiguous position.

This paper reports a proof that spatial congruence, one of the simplest and
most basic equivalences between processes, is decidable. That is, the problem of
checking whether two processes are spatially congruent is decidable.

To prove the decidability of spatial congruence, we use a natural and intuitive
interpretation of ambient processes as edge-labelled unordered trees. This allows
us to concentrate on the subtle interaction between restriction and replication,
two key operators of the ambient calculus. Roughly speaking, restriction accounts
for the dynamic generation of fresh location names and replication is used to
encode recursive behaviours.

The result of our study is twofold. First, we define an effective decision procedure
to test spatial congruence. This procedure is based on basic algorithms
on trees and can be easily implemented. Second, we define a normal form for
processes and a proof method that prove useful in the proof of important
equivalence laws.

The decidability result presented in this paper is useful in many respects. 
Since spatial congruence plays a central role in the definition of the operational 
semantics, any attempt to provide a mechanical proof of semantics-based prop- 
erties will rely on a formal study of spatial congruence and an implementation of 
a test for equivalence of processes. Interesting examples of semantical properties 
include proof of equivalences or validity of program transformations. 

Another application of our result is the study of the modal logic for ambi- 
ents [4], where spatial congruence is used in the definition of the satisfaction 
relation. The decidability of spatial congruence is essential in the proof that 
model checking, for a particular subset of the logic, is decidable. 

The outline of the remainder of this paper is as follows. Section 2 introduces 
the syntax of the ambient calculus and the definition of spatial congruence, and 
Section 3 defines an interpretation of processes as a certain kind of edge-labelled 
trees, called spatial trees. Section 4 studies a very simple notion of equivalence 
between spatial trees. We prove that this equivalence is decidable and we define 
a procedure to test the equivalence of spatial trees. This result relies on the 
existence of a computable normal form. In Section 5, we relate spatial trees to 
processes and tree equivalence to spatial congruence. Then, by transferring the 
results obtained on spatial trees, we prove the decidability of spatial congruence. 
Before concluding, we use our results to prove some interesting equivalence laws. 
A complete definition of the calculus and the omitted proofs may be found in a long
version of this paper [6].

2 The Ambient Calculus

The following tables summarize the syntax of processes and the definition of 
spatial congruence. For the sake of simplicity, we consider a minimal version of 
the untyped ambient calculus that includes only mobility primitives, as defined 
in [3], Section 2. In the extended version of this paper [6], we show that the results 
and algorithms presented here can be smoothly extended to the full ambient 
calculus. 

The operators of the ambient calculus can be separated into two categories:
spatial constructs, which describe the “spatial configuration” of processes, and
temporal constructs, which describe their possible dynamic behaviours.

Spatial constructs are composed of restriction, void, composition and repli- 
cation, which are commonly found in process calculi, and include an original 
constructor, n[P], called an ambient. In the minimal ambient calculus, temporal 
constructs are only composed of actions, act n.P, where n is an ambient name 
and P is a process. In the full ambient calculus, temporal constructs also in- 
clude the input and output operators defined in [3], which are missing from our 
presentation. As pointed out in [4], this separation is similar to the distinction 
between static and dynamic constructs made in CCS [12]. 



Capabilities and processes:

$act\ n$ ::=              capability
    $in\ n$                   can enter $n$
    $out\ n$                  can exit $n$
    $open\ n$                 can open $n$
$P, Q, R$ ::=             processes
    $(\nu n)P$                restriction
    $0$                       void
    $P \mid Q$                composition
    $!P$                      replication
    $n[P]$                    ambient
    $act\ n.P$                action



In a restriction, $(\nu n)P$, the name $n$ is bound with scope $P$. The set of free
names occurring in a process $P$, written $fn(P)$, is defined as follows, where restriction
is the only binder. We identify processes up to consistent renaming of
bound names.

Free names, $fn(P)$, of process $P$:

$fn((\nu n)P) = fn(P) \setminus \{n\}$          $fn(0) = \emptyset$
$fn(P \mid Q) = fn(P) \cup fn(Q)$          $fn(!P) = fn(P)$
$fn(n[P]) = \{n\} \cup fn(P)$          $fn(act\ n.P) = \{n\} \cup fn(P)$
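
The definition of free names translates directly into a recursive function. The Python sketch below assumes a hypothetical encoding of processes as nested tuples (the constructor tags are illustrative, not part of the calculus).

```python
def fn(p):
    """Free names of a process encoded as a nested tuple."""
    tag = p[0]
    if tag == "zero":            # ("zero",)            for 0
        return set()
    if tag == "res":             # ("res", n, P)        for (nu n)P
        return fn(p[2]) - {p[1]}
    if tag == "par":             # ("par", P, Q)        for P | Q
        return fn(p[1]) | fn(p[2])
    if tag == "repl":            # ("repl", P)          for !P
        return fn(p[1])
    if tag == "amb":             # ("amb", n, P)        for n[P]
        return {p[1]} | fn(p[2])
    if tag == "act":             # ("act", cap, n, P)   for cap n.P
        return {p[2]} | fn(p[3])
    raise ValueError(f"unknown process constructor: {tag!r}")
```

For instance, for the process $(\nu n)(n[0] \mid in\ m.0)$ the only free name is $m$, since $n$ is bound by the restriction.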



The rules defining spatial congruence can also be separated into different categories.
The first two categories of rules state that it is an equivalence relation
and a congruence. The third category states that parallel composition is an associative
and commutative operator with identity element $0$. Another category
specifies properties of replicated processes, $!P$, which act like an infinite parallel
composition of replicas of $P$. The last category describes scoping rules for
the restriction operator, $(\nu n)P$, used to model the dynamic generation of new
ambient names.



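Spatial congruence: $P \equiv Q$

$P \equiv P$                                                          (Struct Refl)
$Q \equiv P \Rightarrow P \equiv Q$                                   (Struct Symm)
$P \equiv Q,\ Q \equiv R \Rightarrow P \equiv R$                      (Struct Trans)
$P \equiv Q \Rightarrow (\nu n)P \equiv (\nu n)Q$                     (Struct Res)
$P \equiv Q \Rightarrow (P \mid R) \equiv (Q \mid R)$                 (Struct Par)
$P \equiv Q \Rightarrow\ !P \equiv\ !Q$                               (Struct Repl)
$P \equiv Q \Rightarrow n[P] \equiv n[Q]$                             (Struct Amb)
$P \equiv Q \Rightarrow act\ n.P \equiv act\ n.Q$                     (Struct Action)
$P \mid Q \equiv Q \mid P$                                            (Struct Par Comm)
$(P \mid Q) \mid R \equiv P \mid (Q \mid R)$                          (Struct Par Assoc)
$P \mid 0 \equiv P$                                                   (Struct Par Zero)
$!(P \mid Q) \equiv\ !P \mid\ !Q$                                     (Struct Repl Par)
$!0 \equiv 0$                                                         (Struct Repl Zero)
$!P \equiv P \mid\ !P$                                                (Struct Repl Copy)
$!P \equiv\ !!P$                                                      (Struct Repl Repl)
$(\nu n)(\nu m)P \equiv (\nu m)(\nu n)P$                              (Struct Res Res)
$(\nu n)0 \equiv 0$                                                   (Struct Res Zero)
$n \notin fn(P) \Rightarrow (\nu n)(P \mid Q) \equiv P \mid (\nu n)Q$ (Struct Res Par)
$n \neq m \Rightarrow (\nu n)m[P] \equiv m[(\nu n)P]$                 (Struct Res Amb)
$n \neq m \Rightarrow (\nu n)act\ m.P \equiv act\ m.(\nu n)P$         (Struct Res Action)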
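
As a small illustration of how these axioms combine (a worked example added here, not taken from the original text), the following derivation shows that replicating a process together with its own replication adds nothing. It relies on (Struct Repl Repl), one of the axioms specific to spatial congruence.

```latex
\begin{align*}
!(P \mid\ !P) &\equiv\ !!P && \text{(Struct Repl Copy), (Struct Symm), (Struct Repl)} \\
              &\equiv\ !P  && \text{(Struct Repl Repl), (Struct Symm)}
\end{align*}
```

The first step uses $P \mid\ !P \equiv\ !P$ under the replication operator; the two steps are then chained with (Struct Trans).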



Almost every rule in the spatial congruence definition has an equivalent in
the corresponding π-calculus equivalence, called structural congruence. The most
significant differences lie in the axioms for replication, (Struct Repl Par) and
(Struct Repl Repl), which are missing from the traditional definition of structural
congruence [13]. As a matter of fact, these axioms are also missing from the seminal
presentation of the ambient calculus [3], where the relation $\equiv$ is also called structural
congruence. These differences have motivated our change in terminology.

Intuitively, the structural congruence relations of the π-calculus and the ambient calculus
should be decidable. However, the author is not aware of any proof that
structural congruence is decidable (or undecidable!) and these results seem very
difficult to obtain.

The rules added to spatial congruence are similar to the rules proposed in [7,8]
to extend the standard definition of structural congruence in the π-calculus. In
these papers, the authors proved that the resulting equivalence is decidable.
Another related work is [10], where Hirschkoff independently proposed a similar
extension to structural congruence and proved the decidability result using a
more algorithmic approach. We return to these results in Section 7, where we
review related work.

Since the definition of the operational semantics is not needed in our study, 
we omit the definition of the reduction relation from the presentation. The reader 
interested in a thorough introduction to the ambient calculus is referred to [3]. 



3 Spatial Trees 



We define an interpretation of spatial processes as a certain kind of edge-labelled
unordered trees, which we name spatial trees. A spatial tree represents the
hierarchy defined by ambient nesting, using the traditional notion of hierarchy
defined by sub-trees. In our intuition, edges stand for ambients and are tagged
with an ambient name, nesting stands for ambient encapsulation and, following
our analogy, parallel composition of processes naturally arises as trees sharing the
same root. Since it is not possible to define a process containing an unbounded
number of nested ambients, we will only consider finite-depth trees.

For convenience, and to avoid confusion, we use a distinct category of names,
called markers, to model restricted ambient names. Markers are ranged over by
$x, y, \ldots$ We use $\eta$ to denote a name, $n$, or a marker, $x$. We use $K, L, \ldots$ to denote
sets of names, and $X, Y, Z, \ldots$ for sets of markers.

A multiplicity, $\mu$, is either $1$ or $\infty$. A cone, $C$, is either the empty vector,
written $\epsilon$, an action, $\mu\,act\ \eta.T$, or an edge, $\mu\,\eta[T]$ or $!X.T$, where $T$ is a spatial
tree and $X$ is a non-empty set of markers.

A spatial tree is a finite vector of cones, $C_1 + \cdots + C_k$, also written $\sum_{i \in 1..k} C_i$.
The $+$ operator is commutative and associative, with identity element $\epsilon$; spatial
trees are identified up to these equations.



Cones and spatial trees:

$\mu$ ::=                   multiplicity
    $1$                         single
    $\infty$                    infinite
$C$ ::=                     cone
    $\epsilon$                  empty vector
    $\mu\,act\ \eta.T$          action
    $\mu\,\eta[T]$              edge tagged $\eta$
    $!X.T$                      replicated edge with markers $X$
$S, T$ ::=                  spatial trees
    $C_1 + \cdots + C_k$        vector of cones



Cones are a special type of spatial trees. The cone $!X.T$ represents infinitely many
copies of the tree $T$ such that, in each copy, the elements of $X$ are replaced with
fresh markers. In an edge, $!X.T$, the markers in $X$ are bound with scope $T$. Spatial
trees are identified up to consistent renaming of bound markers.

Free markers, $fm(T)$, of tree $T$:

$fm(\epsilon) = \emptyset$          $fm(!X.T) = fm(T) \setminus X$
$fm(\mu\,n[T]) = fm(T)$          $fm(\mu\,act\ n.T) = fm(T)$
$fm(\mu\,x[T]) = fm(T) \cup \{x\}$          $fm(\mu\,act\ x.T) = fm(T) \cup \{x\}$
$fm(S + T) = fm(S) \cup fm(T)$
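
The free markers of a tree can be computed by a direct recursion over its cones. The Python sketch below assumes a hypothetical encoding: a tree is a tuple of cones (the empty tuple plays the role of the empty vector), a tag is either `("name", n)` or `("marker", x)`, and the cone tags `"edge"`, `"act"` and `"repl"` are illustrative.

```python
def fm(tree):
    """Free markers of a spatial tree given as a tuple of cones.

    Cones are ("edge", mu, tag, T), ("act", mu, cap, tag, T) or ("repl", X, T),
    where tag is ("name", n) or ("marker", x) and X is a set of marker names.
    """
    free = set()
    for cone in tree:
        if cone[0] == "repl":
            # The markers in X are bound in the replicated sub-tree.
            _, bound, sub = cone
            free |= fm(sub) - set(bound)
        else:
            # Edge or action cone: the last two fields are the tag and sub-tree.
            *_, tag, sub = cone
            free |= fm(sub)
            if tag[0] == "marker":
                free.add(tag[1])
    return free
```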



We write $T\{n \leftarrow x\}$ for the capture-avoiding substitution of the marker $x$ for
each occurrence of the name $n$ in the tree $T$. For convenience, we extend the
replication constructor, $!X.T$, to the empty set of markers as follows:

$!\emptyset.\epsilon = \epsilon$
$!\emptyset.\mu\,act\ \eta.T = \infty\,act\ \eta.T$
$!\emptyset.\mu\,\eta[T] = \infty\,\eta[T]$
$!\emptyset.!X.T =\ !X.T$
$!\emptyset.(S + T) =\ !\emptyset.S +\ !\emptyset.T$

Proposition 3.1. We have $!\emptyset.!\emptyset.T =\ !\emptyset.T$.

Since we have a notion of free and bound markers, we can define a notion of
connected tree, that is, a tree whose sub-trees share mutual markers.

Connected trees:

A tree $\sum_{i \in 1..p} C_i$ is connected if and only if there are no partitions of $1..p$ into
two non-empty subsets, $I, J$, such that $fm(\sum_{i \in I} C_i) \cap fm(\sum_{j \in J} C_j) = \emptyset$.



Using this definition, we can compute for each tree the (unique) set of its
connected sub-trees as follows. For every tree $T = \sum_{i \in 1..p} C_i$ we can construct a
graph as follows.

(1) Let $N$ be the set of cones $\{C_1, \ldots, C_p\}$.

(2) Let $G$ be the graph with nodes in $N$ and edges between nodes that have at
least one common free marker.

(3) Compute the connected components of the graph $G$, say $G_1, \ldots, G_k$.

The connected parts of $T$, written $conn(T)$, is the set $\{T_1, \ldots, T_k\}$ such that for
all $i \in 1..k$ the spatial tree $T_i$ is the vector of the cones included in $G_i$. Basic
properties of the connected components of a spatial tree are:

Proposition 3.2. If $\{T_1, \ldots, T_p\}$ is the set of connected components, $conn(T)$,
of a tree $T$ then $T = T_1 + \cdots + T_p$, and for each $j \in 1..p$ the tree $T_j$ is connected.
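
The three-step construction of the connected parts can be sketched with a union-find structure instead of an explicit graph. In the Python sketch below, the cones are opaque labels and their free marker sets are supplied separately; this interface is an assumption made to keep the example self-contained.

```python
def connected_parts(cones, free_markers):
    """Group cones so that cones sharing a free marker end up together.

    cones is a list of cone labels and free_markers[i] is the set of free
    markers of cones[i]. Each returned group is one connected part.
    """
    parent = list(range(len(cones)))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    owner = {}  # marker -> index of the first cone in which it occurs free
    for i, markers in enumerate(free_markers):
        for m in markers:
            if m in owner:
                parent[find(i)] = find(owner[m])
            else:
                owner[m] = i

    groups = {}
    for i, cone in enumerate(cones):
        groups.setdefault(find(i), []).append(cone)
    return list(groups.values())
```

A cone without free markers is an isolated node of the graph and therefore forms a connected part on its own.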

4 Equality of Spatial Trees

We define a reduction relation between trees, $X \vdash S \to T$, parameterised by
a set of markers, $X$, called the effect of the reduction. This reduction relation
captures the essential intuitions of the equivalence between edge-labelled trees
and every rule in its definition corresponds to basic axioms of spatial congruence.
For instance, rule (Red Zero) implies that “empty cones can be forgotten” and
corresponds to rule (Struct Par Zero) of structural congruence. Likewise, rule
(Red Add Edge) implies that “two infinite copies of a sub-tree can be replaced
by only one copy” and corresponds to rule (Struct Repl Copy). We also define
the equivalence induced by $\to$ in almost the same way the λ-calculus reduction
relation induces β-equivalence.

In this section, we prove that every spatial tree can be factorised to an irreducible
form, also called a normal form, which does not depend on the reduction
sequence used to compute it. Normal forms provide us with a unique representative
for each tree and, more significantly, allow us to define a formal procedure
to test the equivalence of trees, a key result in the proof of the decidability of
spatial congruence.

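Reduction: $X \vdash S \to T$

(Red Zero)        $\emptyset \vdash T + \epsilon \to T$
(Red Add Edge)    $\emptyset \vdash \infty\,\eta[T] + \mu\,\eta[T] \to \infty\,\eta[T]$
(Red Add Action)  $\emptyset \vdash \infty\,act\ \eta.T + \mu\,act\ \eta.T \to \infty\,act\ \eta.T$
(Red Add Repl)    $\emptyset \vdash\ !X.T +\ !X.T \to\ !X.T$
(Red Copy)        $X \vdash\ !X.T + T \to\ !X.T$
(Red Sub)         if $X \vdash T \to S$ and $X \subseteq Y$, then $Y \vdash T \to S$
(Red Repl)        if $X \vdash T \to S$ and $Z = Y \cap fm(S)$, then $X \setminus Y \vdash\ !Y.T \to\ !Z.S$
(Red $\eta$)        if $X \vdash T \to S$ and $\eta \notin X$, then $X \vdash \mu\,\eta[T] \to \mu\,\eta[S]$
(Red +)           if $X \vdash T \to S$ and $fm(R) \cap X = \emptyset$, then $X \vdash T + R \to S + R$
(Red Action)      if $X \vdash T \to S$, then $X \vdash \mu\,act\ \eta.T \to \mu\,act\ \eta.S$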




The rules for reduction can be separated into two categories: rules (Red Zero)
to (Red Copy), which involve two cones, or critical pairs, and of which only (Red Copy)
can extend the effect; and rules (Red Repl) to (Red Action), the structural rules,
which state that the relation $\to$ is compositional.

In a reduction, $X \vdash S \to T$, the effect $X$ records the markers that must not
appear free in the result of the reduction.

We can derive an equivalent of rules (Red Add Repl), (Red Copy) and (Red
Repl) for the special case where the set $X$ is empty.

- $\emptyset \vdash\ !\emptyset.T +\ !\emptyset.T \to\ !\emptyset.T$.

- $\emptyset \vdash\ !\emptyset.T + T \to\ !\emptyset.T$.

- If $\emptyset \vdash T \to S$ then $\emptyset \vdash\ !\emptyset.T \to\ !\emptyset.S$.

Next, we define the equivalence relation on spatial trees induced by $\to$.

Equivalence relation between trees: $S \sim_X T$ and $S \approx T$

The relation $\sim_X$ is the smallest reflexive, symmetric and transitive relation such
that if $X \vdash S \to T$ then $S \sim_X T$. The relation $\approx$ is such that $S \approx T$ if and only
if there exist two finite injective mappings, $\sigma_1, \sigma_2$, and a set of markers $X$ such
that $dom(\sigma_1) = fm(S)$ and $dom(\sigma_2) = fm(T)$ and $S\sigma_1 \sim_X T\sigma_2$.

From the structural rules of $\to$, it is trivial to show that $\sim_X$ is a congruence
and that if $Y \subseteq X$ then $\sim_Y\ \subseteq\ \sim_X\ \subseteq\ \approx$. Basic properties of $\approx$ are:

Proposition 4.1. The relation $\approx$ satisfies the congruence properties, that is, if
$(fm(S) \cup fm(T)) \cap fm(R) = \emptyset$ and $S \approx T$ then $S + R \approx T + R$; if $S \approx T$ then
$\mu\,n[S] \approx \mu\,n[T]$; if $S \approx T$ then $\mu\,act\ \eta.S \approx \mu\,act\ \eta.T$.

We prove that the reduction relation on spatial trees is locally confluent.

Lemma 4.2. If $X_1 \vdash T \to T_1$ and $X_2 \vdash T \to T_2$ then there exists a tree $S$ such
that $X_1 \cup X_2 \vdash T_1 \to^* S$ and $X_1 \cup X_2 \vdash T_2 \to^* S$.

Proof. By induction on the derivation of $X_1 \vdash T \to T_1$. For the sake of brevity,
we only consider the cases in which the two reductions originate from critical
pairs that share a common cone. The complete proof can be found in [6].

In the particular case considered here, the tree $T$ must be a composition,
$R_1 + C + R_2 + T'$, where $X_i \vdash R_i + C \to S_i$ and $T_i = S_i + R_j + T'$ for each
$i \in \{1, 2\}$ and $i \neq j$. These must have been derived from (Red +) and therefore
we have the side condition (*) $fm(R_2 + T') \cap X_1 = fm(R_1 + T') \cap X_2 = \emptyset$.

The proof follows by a case analysis on the rules used to derive the two
reductions $X_1 \vdash R_1 + C \to S_1$ and $X_2 \vdash R_2 + C \to S_2$.

(Red Zero)-(Red Zero) Then $C = \epsilon$ and $T_1 = T_2 = R_1 + R_2 + S$. Trivial.

(Red Add Edge)-(Red Add Edge) Then $R_1 = \mu_1\,\eta[R]$, $R_2 = \mu_2\,\eta[R]$, $S_1 =
S_2 = \infty\,\eta[R]$ and $C$ is a cone $\mu\,\eta[R]$ for some multiplicities $\mu_1, \mu_2, \mu$ such that
$\infty \in \{\mu, \mu_i\}$ for each $i \in \{1, 2\}$. Trivial. Case (Red Add Repl)-(Red Add
Repl) is similar.

(Red Add Edge)-(Red Copy) Then $C$ is an edge $\mu\,\eta[R]$ and it must be the
case that $R_1 = \mu_1\,\eta[R]$ and $R_2 =\ !X.\mu\,\eta[R]$, where $\mu$ or $\mu_1$ is an infinite
multiplicity and $X \subseteq X_2$. This case is impossible since (Red Copy) implies
that $fm(\eta[R]) \cap X_2 \neq \emptyset$, which conflicts with the side condition (*). Case
(Red Copy)-(Red Add Edge) is similar.

(Red Add Repl)-(Red Copy) Then $T = R +\ !X.R +\ !X.R + T'$ and $S_1 =
R +\ !X.R$ and $S_2 =\ !X.R +\ !X.R$, where $X_1 = \emptyset$ and $X_2 = X$. By (Red Copy)
and (Red +), we have that $X \vdash T_1 \to\ !X.R + T'$. By (Red Add Repl) and
(Red +), $\emptyset \vdash T_2 \to\ !X.R + T'$, as required.

(Red Copy)-(Red Copy) Then $C =\ !X.R$ for some tree $R$ and sets of markers
$X$, such that $R_1\{X \leftarrow X_1\} = R_2\{X \leftarrow X_2\} = R$ and $(X_1 \cup X_2) \cap fm(T') = \emptyset$. By
(Red Copy) and (Red +), we get that $X_2 \vdash T_1 \to\ !X.R + T'$ and $X_1 \vdash T_2 \to\
!X.R + T'$, as required. □

Since it can be proved that the reduction relation is decreasing, in the sense
that the number of symbols in the definition of a tree decreases after a reduction,
there can only be a finite number of reductions from any tree and we have:

Theorem 4.3. The relation $\to$ is strongly normalizing and confluent.

We can define an algorithm to decide the equivalence of spatial trees based
on this result. To decide if $S_1 \sim_X S_2$, you compute the normal forms of $S_1$ and
$S_2$, that is, the spatial trees $S_1', S_2'$ such that $X \vdash S_i \to^* S_i'$ and $S_i'$ is irreducible
for each $i \in 1..2$. By Theorem 4.3, these trees exist and can be computed using
a finite number of reductions. Then, you verify whether the normal forms are
equal.

Theorem 4.4. The equivalences $\sim_X$ and $\approx$ are decidable.

Proof. To decide if $S_1 \sim_X S_2$, you compute the normal forms of $S_1$ and $S_2$, that
is, the spatial trees $S_1', S_2'$ such that $X \vdash S_i \to^* S_i'$ and $S_i'$ is irreducible for each
$i \in 1..2$. By Theorem 4.3, these trees exist and can be computed using a finite
number of reductions. Then, you verify whether the normal forms are equal.
This amounts to testing the equality of trees up to the renaming of bound markers
and the associativity-commutativity of $+$. Since this is a decidable problem, we
get that $\sim_X$ is decidable.

To decide if $S_1 \approx S_2$, you test whether $S_1\sigma_1 \sim_X S_2\sigma_2$ for each pair of finite injective
mappings $\sigma_1, \sigma_2$ and for each set $X$ such that $dom(\sigma_1) = fm(S_1)$ and
$dom(\sigma_2) = fm(S_2)$ and $X \subseteq fm(S_1\sigma_1) \cup fm(S_2\sigma_2)$. It is sufficient to consider
mappings $\sigma_1, \sigma_2$ that have their image in a fresh set of markers that has the
cardinality of $fm(S_1) \cup fm(S_2)$. Since the sets $fm(S_2)$ and $fm(S_1)$ are finite, and
since $\sim_X$ is decidable, we get that $\approx$ is decidable. □
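
To give a feel for the "normalise, then compare" procedure, the Python sketch below treats only a restricted fragment: trees without markers and without replicated edges $!X.T$ for non-empty $X$. In this fragment only (Red Zero), (Red Add Edge) and (Red Add Action) apply. The tuple encoding is an assumption: a tree is a tuple of cones, the empty tuple stands for the empty vector (so (Red Zero) is implicit), and sorting the cones accounts for the commutativity and associativity of $+$. It is not the full decision procedure of Theorem 4.4.

```python
import math
from collections import defaultdict

INF = math.inf  # stands for the infinite multiplicity


def normalize(tree):
    """Normal form of a marker-free tree, as a sorted tuple of cones.

    Cones are ("edge", mu, name, T) or ("act", mu, cap, name, T), where mu
    is 1 or INF and T is again a tuple of cones.
    """
    groups = defaultdict(list)
    for cone in tree:
        kind, mu, *label, sub = cone
        # Cones that agree on everything except the multiplicity are grouped.
        groups[(kind, *label, normalize(sub))].append(mu)
    result = []
    for (kind, *rest), mus in groups.items():
        *label, sub = rest
        if INF in mus:
            # An infinite copy absorbs every other copy of the same cone.
            result.append((kind, INF, *label, sub))
        else:
            # Finitely many single copies cannot be merged.
            result.extend((kind, 1, *label, sub) for _ in mus)
    return tuple(sorted(result))


def equivalent(s, t):
    """Compare two marker-free trees by comparing their normal forms."""
    return normalize(s) == normalize(t)
```

For example, a tree made of an infinite edge and a single edge with the same tag and sub-tree is equivalent to the infinite edge alone, whereas two single copies are not equivalent to one.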

Using the strong normalization property, we can define a notion of normal
form for trees. For all spatial trees $T$, there is a tree $T'$ such that $T \approx T'$ and:

$$T' = \sum_{i \in I_1} \mu_i\,\eta_i[T_i] + \sum_{i \in I_2}\ !X_i.T_i + \sum_{i \in I_3} \mu_i\,act\ \eta_i.T_i \qquad (4.1)$$

where (1) $I_1$, $I_2$ and $I_3$ are finite and pairwise disjoint sets of indices; (2)
for all indices $i \in I_1 \cup I_2 \cup I_3$, $T_i$ is in normal form; (3) for all $i, j \in I_1$, if
$\eta_i = \eta_j$ then $T_i \not\approx T_j$ or $\mu_i = \mu_j = 1$; and (4) for all $i, j \in I_2$, if $!X_i.T_i \approx\ !X_j.T_j$
then $i = j$.

It is worth mentioning that, contrary to a typical situation with normal 
forms found in other theoretical frameworks, the normal form given in (4.1) is 
syntactically smaller than the spatial trees associated with it. 

5 Relation Between Trees and Processes 

We define the tree semantics of processes, that is, a mapping from ambient 
processes to spatial trees, and we relate spatial congruence with the equivalence 
on spatial trees. Then, by transferring the decidability result obtained in the 
previous section, we infer the decidability of spatial congruence. This semantics 
extends a similar definition given in an extended version of [4] for a calculus 
without name restriction. 

In the definition of the tree semantics of processes, we use a new operation
on trees called exponentiation, $exp(T)$, obtained as the outcome of replicating
every connected part of $T$. More formally, the exponentiation of a tree $T$ is
the composition $!X_1.T_1 + \cdots + !X_p.T_p$ where $\{T_1, \ldots, T_p\} = conn(T)$ are the
connected parts of $T$ and $X_i = fm(T_i)$ for each $i \in 1..p$.

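Tree semantics:

$\llbracket 0 \rrbracket = \varepsilon$ (Zero)

$\llbracket act\,\eta.P \rrbracket = 1\,act\,\eta.\llbracket P \rrbracket$ (Action)

$\llbracket n[P] \rrbracket = 1\,n[\llbracket P \rrbracket]$ (Amb)

$\llbracket !P \rrbracket = exp(\llbracket P \rrbracket)$ (Repl)

$fm(\llbracket P \rrbracket) \cap fm(\llbracket Q \rrbracket) = \emptyset \Rightarrow \llbracket P \mid Q \rrbracket = \llbracket P \rrbracket + \llbracket Q \rrbracket$ (Par)

$x \notin fm(\llbracket P \rrbracket) \Rightarrow \llbracket (\nu n)P \rrbracket = \llbracket P \rrbracket\{n \leftarrow x\}$ (Res)
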

In the same way that tree composition, S + T, corresponds to parallel
composition for processes, exponentiation is the analogue of replication. Furthermore,
it is possible to prove properties of this derived operator corresponding to rules
(Struct Repl), (Struct Repl Par), (Struct Repl Repl) and (Struct Repl Copy)
respectively.

Proposition 5.1.

(1) If $S \approx T$ then $exp(S) \approx exp(T)$.

(2) If $fm(S) \cap fm(T) = \emptyset$ then $exp(S + T) \approx exp(S) + exp(T)$.

(3) The function $exp(.)$ is idempotent: $exp(exp(T)) = exp(T)$.

(4) For all spatial trees $T$ we have $exp(T) + T \approx exp(T)$.

Using these properties, it is easy to prove that the axiomatisation of spatial 
congruence is sound. 

Lemma 5.2. If $P \equiv Q$ then $\llbracket P \rrbracket \approx \llbracket Q \rrbracket$.

Next, we prove the completeness of our axiomatisation. We start by defining 
an inverse mapping from trees to processes. 

Process semantics of trees:

$([\varepsilon]) = 0$ (Empty)

$([1\,act\,n.T]) = act\,n.([T])$ (Action 1)

$([\infty\,act\,n.T]) = !act\,n.([T])$ (Action ∞)

$([1\,n[T]]) = n[([T])]$ (Edge 1)

$([\infty\,n[T]]) = !n[([T])]$ (Edge ∞)

$([!\{x_1, \ldots, x_p\}.T]) = !(\nu n_1) \ldots (\nu n_p)([T\{x_1 \leftarrow n_1\} \ldots \{x_p \leftarrow n_p\}])$ (Repl)

where $\{n_1, \ldots, n_p\}$ is a set of pairwise distinct names not free in $([T])$.

$([S + T]) = ([S]) \mid ([T])$ (Sum)



The composition of the two interpretations $([\,.\,])$ and $\llbracket . \rrbracket$ differs from the identity
over processes. For instance, we have $([\llbracket (\nu n)0 \rrbracket]) = ([\llbracket 0 \rrbracket])$. Nonetheless, we can
draw a simple relation between a process and the meaning of its interpretation.
See Proposition 5.3 (2) below.

Let the meaning of a tree $T$, written $mean(T)$, be the process $(\nu N)([T\sigma])$,
where $\sigma$ is a bijection from $fm(T)$ to a set of fresh names and $N$ is $\sigma(fm(T))$,
the image of $\sigma$. Properties of $mean(.)$ are:

Proposition 5.3. (1) If $S \approx T$ then $mean(S) \equiv mean(T)$ and (2) for all
processes $P$ we have $mean(\llbracket P \rrbracket) \equiv P$.

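Let $P$ and $Q$ be two processes such that $\llbracket P \rrbracket \approx \llbracket Q \rrbracket$. By Proposition 5.3 (1),
$mean(\llbracket P \rrbracket) \equiv mean(\llbracket Q \rrbracket)$. By Proposition 5.3 (2), $P \equiv mean(\llbracket P \rrbracket)$ and $Q \equiv
mean(\llbracket Q \rrbracket)$. Hence, by transitivity of spatial congruence, $P \equiv Q$. This proves
that our interpretation of processes as spatial trees is complete, that is:

Lemma 5.4. If $\llbracket P \rrbracket \approx \llbracket Q \rrbracket$ then $P \equiv Q$.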

Lemmas 5.2 and 5.4 state a full abstraction result between ambient processes
and spatial trees with respect to the equivalences $\equiv$ and $\approx$ respectively. Therefore,
every problem in the ambient calculus can be expressed in terms of a problem
on spatial trees. For instance, to decide whether $P \equiv Q$, a possible method is to
compute $\llbracket P \rrbracket$ and $\llbracket Q \rrbracket$ and to verify whether they are equivalent. By Theorem 4.4, this
problem is decidable. It follows that:

Theorem 5.5. The relation $\equiv$ is decidable.

Using our interpretation of processes as spatial trees, we obtain another result 
for free. Indeed, through Lemma 5.4 and the normal form for spatial trees given 
in Section 4, we obtain a normal form for ambient processes that is unique up 
to very simple spatial transformations, that is, commutativity-associativity of 
the parallel composition and the reordering of restrictions. If $L$ is a finite set
of names $\{n_1, \ldots, n_p\}$, we write $(\nu L)P$ for the process $(\nu n_1) \ldots (\nu n_p)P$. For all
processes $P$, there is a process $P'$ such that $P \equiv P'$ and:

$$P' = (\nu L)\Big( \prod_{i \in I_1} n_i[Q_i] \;\Big|\; \prod_{i \in I_2} !n_i[Q_i] \;\Big|\; \prod_{i \in I_3} !(\nu L_i)Q_i \;\Big|\; \prod_{i \in I_4} act\, n_i.Q_i \;\Big|\; \prod_{i \in I_5} !act\, n_i.Q_i \Big) \qquad (5.1)$$

where (1) the sets of indices $I_1, \ldots, I_5$ are finite and pairwise disjoint; (2) for
all $i \in \bigcup_{j \in 1..5} I_j$, the processes $Q_i$ are in normal form; (3) for all $i \in I_1$, $j \in I_2$,
if $n_i = n_j$ then $Q_i \not\equiv Q_j$; (4) for all $i, j \in I_3$, if $(\nu L_i)Q_i \equiv (\nu L_j)Q_j$ then $i = j$.

6 Applications 

We can apply the results given in this paper to prove interesting equivalence
laws such as those listed in Lemma 6.1 below. The laws examined
in this section are particularly interesting because they are, at the same time,
very useful in the formal study of the ambient calculus, and very difficult to
prove directly, for example by induction on derivations of the
form $P \equiv Q$.

In the particular example of Lemma 6.1, we study three equivalence laws 
extracted from the presentation of Cardelli’s and Gordon’s modal logic for am- 
bients [4], a logic used to describe properties of processes. These laws are essential 
to prove the soundness of several axioms of the logic. 

An interesting fact is that we follow a similar proof technique in each case.
We start by using the full abstraction result obtained in Section 5 to restate
the problem in terms of equivalence between spatial trees. Then, we prove the
desired equivalence by exhibiting a property that is invariant under the reduction
relation over trees.

Lemma 6.1.

(1) If $P \mid Q \equiv 0$ then $P \equiv 0$ and $Q \equiv 0$.

(2) If $n[P] \equiv Q \mid R$ then either $Q \equiv n[P]$ and $R \equiv 0$, or $Q \equiv 0$ and $R \equiv n[P]$.

(3) If $m[P] \equiv n[Q]$ then $m = n$ and $P \equiv Q$.

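Proof. We only sketch the proof for case (1). Proofs for the other cases are
similar and can be found in [6].

By the full abstraction result stated in Section 5, this problem is equivalent
to proving that for all spatial trees $S$, $T$, if $S + T \approx \varepsilon$ then $S \approx \varepsilon$ and $T \approx \varepsilon$.
By Theorem 4.3, since $\varepsilon$ is an irreducible spatial tree with no free markers, this
is also equivalent to proving that if $S + T \to^* \varepsilon$ then $S \to^* \varepsilon$ and $T \to^* \varepsilon$.

The proposition follows by showing that for any finite set of cones, $(C_i)_{i \in I}$,
if $X \vdash \sum_{i \in I} C_i \to^* \varepsilon$ then $X \vdash C_i \to^* \varepsilon$ for all $i \in I$. This can be proved by an
easy induction on the derivation of $X \vdash \sum_{i \in I} C_i \to^* \varepsilon$.

Now, assume $P \mid Q \equiv 0$. By Lemma 5.2, $\llbracket P \rrbracket + \llbracket Q \rrbracket \approx \varepsilon$. Hence, there exists
a set $X$ such that $\llbracket P \rrbracket + \llbracket Q \rrbracket \sim_X \varepsilon$. By Lemma 4.2, and since $\varepsilon$ is an irreducible
spatial tree, we get that $X \vdash \llbracket P \rrbracket + \llbracket Q \rrbracket \to^* \varepsilon$, and therefore, $X \vdash \llbracket P \rrbracket \to^* \varepsilon$ and
$X \vdash \llbracket Q \rrbracket \to^* \varepsilon$. Hence, $\llbracket P \rrbracket \approx \varepsilon$ and $\llbracket Q \rrbracket \approx \varepsilon$. By Lemma 5.4, $P \equiv Q \equiv 0$. □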

Next, we prove three equivalence laws that validate the distribution of name 
restriction over void, ambient, and parallel composition. These laws play an 
essential role in the definition of axioms for an extension of the ambient modal 
logics with an operator for name restriction [5]. 

Lemma 6.2.

(1) If $(\nu n)P \equiv 0$ then $P \equiv 0$.

(2) If $(\nu n)P \equiv m[Q]$ then there exists $R$ such that $P \equiv m[R]$ and $Q \equiv (\nu n)R$.

(3) If $(\nu n)P \equiv Q \mid R$ then there exist two processes, $P_1$, $P_2$, such that $P \equiv P_1 \mid P_2$,
and $Q \equiv (\nu n)P_1$, and $R \equiv (\nu n)P_2$.

Proof. Proof of (1) is similar to the proof of Lemma 6.1 (1) sketched above.
In particular, we use the property that for any finite set of cones, $(C_i)_{i \in I}$, if
$X \vdash \sum_{i \in I} C_i \to^* \varepsilon$ then $X \vdash C_i \to^* \varepsilon$ for all $i \in I$.

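For (2), assume $(\nu n)P \equiv m[Q]$. By Lemma 5.2, $\llbracket (\nu n)P \rrbracket \approx \llbracket m[Q] \rrbracket$. Therefore,
for every fresh marker, $x$, we have $\llbracket P \rrbracket\{n \leftarrow x\} \approx 1\,m[\llbracket Q \rrbracket]$. By definition of $\approx$, there
exist two finite injective mappings, $\sigma_1, \sigma_2$ and a set $X$ such that $\llbracket P \rrbracket\sigma_1\{n \leftarrow y\} \sim_X
1\,m[\llbracket Q \rrbracket\sigma_2]$ where $y = \sigma_1(x)$. Let $S$ be the normal form of $\llbracket P \rrbracket\sigma_1$. Therefore,
$S \approx \llbracket P \rrbracket\sigma_1 \sim_Y 1\,m[\llbracket Q \rrbracket\sigma_2\{y \leftarrow n\}]$. Since $S$ is in normal form, it must be the
case that $S = 1\,m[T]$ for some tree $T$ such that $T \approx \llbracket Q \rrbracket\sigma_2\{y \leftarrow n\}$. Let $R$ be
the process $mean(T)$. Then, $\llbracket m[R] \rrbracket \approx S \approx \llbracket P \rrbracket$ and, by Lemma 5.4, $m[R] \equiv P$.

Moreover, $\llbracket (\nu n)R \rrbracket \approx T\{n \leftarrow y\} \approx \llbracket Q \rrbracket$. By Lemma 5.4, $(\nu n)R \equiv Q$, as required.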

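For (3), assume $(\nu n)P \equiv Q \mid R$. By Lemma 5.2, $\llbracket Q \mid R \rrbracket \approx \llbracket (\nu n)P \rrbracket$.
Therefore, for every fresh marker, $x$, we have $\llbracket Q \rrbracket + \llbracket R \rrbracket \approx \llbracket P \rrbracket\{n \leftarrow x\}$, where
$fm(\llbracket Q \rrbracket) \cap fm(\llbracket R \rrbracket) = \emptyset$. By definition, there exist two finite injective mappings,
$\sigma_1, \sigma_2$ and a set $X$ such that $(\llbracket Q \rrbracket + \llbracket R \rrbracket)\sigma_1 \sim_X \llbracket P \rrbracket\sigma_2\{n \leftarrow y\}$, where $y = \sigma_2(x)$.

Let $S$, $T$ and $O$ be the normal forms of $\llbracket Q \rrbracket\sigma_1$, $\llbracket R \rrbracket\sigma_1$ and $\llbracket P \rrbracket\sigma_2$ respectively.
Hence, $S + T \sim_Y O\{n \leftarrow y\}$ for some set of markers $Y$ such that $X \subseteq Y$ and
with the side condition: $fm(S) \cap fm(T) = \emptyset$. Assume $\sum_{i \in 1..p} C_i$ is the common
normal form of $S + T$ and $O\{n \leftarrow y\}$. Since $S$, $T$ and $O$ are normal forms, there
exist three families of spatial trees in normal form, $(S_i)_{i \in 1..p}$, $(T_i)_{i \in 1..p}$, and
$(O_i)_{i \in 1..p}$, such that:

(1) $S = \sum_{i \in 1..p} S_i$ and $T = \sum_{i \in 1..p} T_i$ and $O = \sum_{i \in 1..p} O_i$.

(2) $S_i + T_i \sim_Y O_i\{n \leftarrow y\}$ for each $i \in 1..p$.

(3) $Y \vdash S_i + T_i \to^* C_i$ and $Y \vdash O_i\{n \leftarrow y\} \to^* C_i$ for each $i \in 1..p$.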

The proof follows by constructing the spatial trees corresponding to the
processes $P_1$, $P_2$. We proceed by defining two families of trees, $(S'_i)_{i \in 1..p}$ and
$(T'_i)_{i \in 1..p}$, and proving that $O_i \sim_Y (S'_i + T'_i)$, and $S_i \sim_Y S'_i\{n \leftarrow y\}$, and $T_i \sim_Y
T'_i\{n \leftarrow y\}$ for each $i \in 1..p$. The trees $S'_i$ and $T'_i$ are defined by case analysis on
the definition of $C_i$.

(Empty) Then $C_i = \varepsilon$. Since $S$, $T$ and $O$ are in normal form, it must be the
case that $S_i = T_i = O_i = \varepsilon$. Let $S'_i = T'_i = \varepsilon$. Trivial.

(Action) Then $C_i = \mu\,act\,\eta.S'$. Since $O_i$ is in normal form, it must be the case
that $O_i\{n \leftarrow y\} = \mu\,act\,\eta.S'$. Let $S'_i = S_i\{y \leftarrow n\}$ and $T'_i = T_i\{y \leftarrow n\}$. Trivial.
We follow the same definition for the cases where $C_i$ is an edge.

(Repl) Then $C_i = !Y'.T'$. Since $O_i$ is in normal form, it must be the case that
$O_i\{n \leftarrow y\} = !Y'.T'' + T'''$ and $T' \sim_{Y \cup Y'} T''$. Since $S_i$ and $T_i$ are in normal
form and $fm(S_i) \cap fm(T_i) = \emptyset$, it must be the case that either (1) $S_i \sim_Y C_i$
or (2) $T_i \sim_Y C_i$. Assume we are in case (1). Let $S'_i = (S_i + T''')\{y \leftarrow n\}$ and
$T'_i = T_i\{y \leftarrow n\}$. Then $S'_i\{n \leftarrow y\} \sim_Y C_i + T''' \sim_Y C_i \sim_Y S_i$, and $(S'_i + T'_i) =
(S_i + T_i + T''')\{y \leftarrow n\} \sim_Y (C_i + T''')\{y \leftarrow n\} \sim_Y O_i$, as required.



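An easy induction on the definition of $\sum_{i \in 1..p}$ proves that $\sum_{i \in 1..p} S_i \sim_Y \sum_{i \in 1..p} S'_i\{n \leftarrow y\}$
and $\sum_{i \in 1..p} T_i \sim_Y \sum_{i \in 1..p} T'_i\{n \leftarrow y\}$ and $O \sim_Y \sum_{i \in 1..p} (S'_i + T'_i)$.
Let $P_1$ and $P_2$ be the processes $mean(\sum_{i \in 1..p} S'_i)$ and $mean(\sum_{i \in 1..p} T'_i)$
respectively. Hence, $\llbracket P \rrbracket \approx O \sim_Y \llbracket P_1 \rrbracket + \llbracket P_2 \rrbracket$ and, by Lemma 5.4, $P \equiv P_1 \mid P_2$.
Moreover, $\llbracket (\nu n)P_1 \rrbracket \approx \sum_{i \in 1..p} S_i \approx \llbracket Q \rrbracket$ and $\llbracket (\nu n)P_2 \rrbracket \approx \sum_{i \in 1..p} T_i \approx \llbracket R \rrbracket$. By
Lemma 5.4, $(\nu n)P_1 \equiv Q$ and $(\nu n)P_2 \equiv R$, as required. □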



Given three processes, $P$, $Q$ and $R$, such that $(\nu n)P \equiv Q \mid R$, we define
a solution of Lemma 6.2 (3) to be a couple $(P_1, P_2)$ such that $P \equiv P_1 \mid P_2$,
$Q \equiv (\nu n)P_1$, and $R \equiv (\nu n)P_2$. For example, the next equations give a solution
to a non-trivial instance of (3) obtained by following the steps described in the
proof of Lemma 6.2.

$$(\nu n)\big(\underbrace{!(\nu n)n[0] \mid n[0]}_{P}\big) \;\equiv\; \underbrace{!(\nu n)n[0]}_{Q} \;\mid\; \underbrace{!(\nu n)n[0]}_{R}$$

$$\equiv\; (\nu n)\big(\underbrace{!(\nu n)n[0] \mid n[0]}_{P_1}\big) \;\mid\; (\nu n)\,\underbrace{!(\nu n)n[0]}_{P_2}$$

It is not clear how to prove Lemma 6.2 (3) without using spatial trees as 
an intermediate representation, and it is even less clear how to obtain solutions 
for this law. Therefore, it is interesting to note that, following the constructive 
approach taken in this paper, our proof not only demonstrates that there is 
always a solution, but also describes an algorithm to compute it. 



7 Discussion 

We propose an algorithmic method to decide whether two ambient processes are 
spatially congruent, or not. This method is based on an intuitive interpretation 
of processes as edge-labelled trees, and a strongly normalizing rewriting system. 

The definitions and proof techniques defined in this paper can easily be
transposed to other process calculi equipped with a chemical semantics, such as the
π-calculus for instance, and natural candidates for comparison are [7] and [10].
Other examples of calculi amenable to the same study include the spi-calculus
of Abadi and Gordon [1] and some process calculi of concurrent objects, like
TyCo [14] and concς [9].

Our definition of spatial congruence is very similar to the definition of the
π-calculus equivalence given in [7]. Hence, we obtain a new proof of decidability
for structural congruence. A major difference with Engelfriet and Gelsema's
work is that we propose a more direct approach, and define an algorithm to
decide the equivalence of processes.

In the work of Hirschkoff [10], the decidability of structural congruence is
proved using a rewriting system, as is the case in this paper. There are two
main differences with Hirschkoff's approach. First, we use an intermediate data
structure, the spatial trees, which eliminates the need to explicitly manipulate
the associative and commutative parallel composition operator. Second, we use
an exponentiation function in the interpretation of processes. These two
differences should result in a more efficient algorithm. Another distinguishing feature
of our work is the definition of an effective technique for proving equivalence laws.

The results obtained in this paper are interesting because they lay the formal
basis for the development of an algorithm to check spatial congruence. Such an
automatic tool for testing the equivalence of processes is a necessary component in
machine-based verification of properties of the ambient calculus. A benefit of the
algorithm obtained with our approach, which has been successfully implemented
by Romain Kervarc and Daniel Hirschkoff [11], is that it is based on well-studied
algorithms over trees, such as associative-commutative tree unification.

Another interest of our study is given in Section 6, where we apply our 
theoretical framework to the proof of equivalence laws used in the definition of 
Cardelli’s and Gordon’s modal logic for mobile ambients [4,5]. 

Abstract. The trajectory of moving objects in video data plays an important
role in video indexing for content-based retrieval. In this paper,
we propose a new spatio-temporal representation scheme for modeling 
moving objects’ trajectories in video data. In order to support content- 
based retrieval on video data very well, our representation scheme con- 
siders the moving distance of an object during a given time interval 
as well as its temporal and spatial relations. Based on our representa- 
tion scheme, we present two similarity measures for both the trajectory 
of a single moving object and those of multiple moving objects, which 
provide ranking for the retrieved video results. Finally, we show from 
our experiment that our representation scheme achieves about 10-20% 
higher precision while it holds about the same recall, compared with its 
competitors, such as Li’s and Shan’s schemes. 



1 Introduction 

Multimedia database systems have recently become a critical research area of
computer systems because so many applications are required to deal with
multimedia data such as image, audio and video. These applications include digital
libraries, advertisements, video on demand (VOD), digital broadcasting, cyber
museums, and electronic commerce. Because they generally handle a very large
amount of multimedia data, it is necessary to support content-based retrieval
on the multimedia data themselves [WN1][PH1][JS1][TG1][VE1]. In video data, the
trajectory of a moving object plays an important role in video indexing for
content-based retrieval. The trajectory can be represented as a spatio-temporal
relationship between moving objects, including both their spatial and temporal
properties [NF1][SA1][HS1][GD1][AM1]. Examples of user queries based on the
spatio-temporal relationship are: "Find all objects whose motion trajectory
is similar to the trajectory shown in a user interface." or "Find all shots with
a scene in which two cars approach each other."

To handle the queries, there have been many studies on the temporal relationship
and the spatial relationship between moving objects in video data. The studies
on the temporal relationship are based on the thirteen temporal relations proposed
by Allen [JF1] while those on the spatial relationship [SQ1][JY1] are based on
topological and directional relations using spatial coordinates. While most of
the studies have concentrated on spatio-temporal relationships between moving 



J. He and M. Sato (Eds.): ASIAN 2000, LNCS 1961, pp. 104-118, 2000. 
(c) Springer- Verlag Berlin Heidelberg 2000 




A Spatio-temporal Representation Scheme for Modeling Moving Objects 105 



objects, they did not consider their moving distance during a given time interval.
For modeling moving objects' trajectories in video data, it is necessary to
consider their moving distance as well as their spatio-temporal relationships so as to
support content-based retrieval on video data well. For example, in the case of
soccer game video data, it is very important to decide whether the trajectory of
a soccer ball belongs to "a long pass" or "a short pass". This can be determined
by using the moving distance of the ball in a given time interval, rather than
its direction.

In this paper, we propose a new spatio-temporal representation scheme for 
modeling moving objects’ trajectories in video data. In order to support content- 
based retrieval on video data very well, our representation scheme considers 
the moving distance of an object during a given time interval as well as its 
temporal and spatial relations. Based on our representation scheme, we present 
two similarity measures for both a single moving object’s and multiple moving 
objects’ trajectories, called SDST (Similarity measure based on moving Distance 
for Single object’s Trajectories) and SDMT (Similarity measure based on moving 
Distance for Multiple objects’ Trajectories). 

This paper is organized as follows. In Section 2, we introduce related work in 
the area of content-based video retrieval using spatio-temporal relationships. In 
Section 3, we propose a new spatio-temporal representation scheme for modeling 
moving objects’ trajectories. In Section 4, we describe new similarity measure 
algorithms to calculate the similarity between a user query and moving objects 
in video databases. In Section 5, we compare the performance of our scheme
with those of Li's and Shan's schemes. Finally, we draw our conclusions and
suggest future work in Section 6. 



2 Related Work 

There has been some research on content-based video retrieval using
spatio-temporal relationships in video data. First, assuming that a moving object is a
salient one moving over time, Li et al. [JM1][JM2] represented the trajectory of a
moving object using eight directions: North(NT), Northwest(NW), Northeast(NE),
West(WT), Southwest(SW), East(ET), Southeast(SE), and South(ST).
They represented as $(S_i, d_i, I_i)$ the trajectory of a moving object
A over a given time interval $I_i$, where $S_i$ is the displacement of A and $d_i$ is
a direction. They also represented as $A(\alpha, \beta, I_k)B$ the spatio-temporal
relationships between moving objects A and B over time interval $I_k$. Here $\alpha$ is
one of eight topological relationships: Disjoint(DJ), Touch(TC), Equal(EQ),
Inside(IN), Covered_by(CB), Contains(CT), Covers(CV), Overlap(OL). $\beta$ is the
directional relationship between moving objects A and B. Therefore, the
spatio-temporal relationships between moving objects A and B can be represented as
a list of motions, like $A[(\alpha_1, \beta_1, I_1), (\alpha_2, \beta_2, I_2), \ldots, (\alpha_n, \beta_n, I_n)]B$. Based on
the representations for moving objects' trajectories, they presented a similarity
measure to compute the similarity of spatio-temporal relationships between
two moving objects.

Table 1. Distances of directional relations

Table 2. Distances of topological relations

Let $\{M_1, M_2, \ldots, M_m\}$ ($m \geq 1$) be the trajectory of moving
object A, $\{N_1, N_2, \ldots, N_n\}$ be the trajectory of moving object B, and $m \leq n$. The
similarity measure between the trajectory of object A and that of object B,
TrajSim(A, B), is computed by using the similarity distances of directional (Table
1) and topological relations (Table 2) as follows:

$$minDiff(A, B) = MIN\Big\{ \sum_{i=1}^{m} distance(M_i, N_{i+j}) \Big\} \quad (\forall j,\ 0 \leq j \leq n - m)$$

$$TrajSim(A, B) = \frac{maxDiff(A, B) - minDiff(A, B)}{maxDiff(A, B)}$$


Here, minDiff(A, B) and maxDiff(A, B) are the smallest and the largest distance
between A and B, respectively. When the moving direction of A is
opposite to that of B in all the comparisons, maxDiff(A, B) = 4*m, where the
maximum number of compared motions is m. Also, this measure considers only the
directional relationship when computing the similarity of a single object's
trajectory between video and query.
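
The sliding comparison behind minDiff and TrajSim can be sketched as follows for the directional part only. This is a hedged illustration, not Li et al.'s implementation: because Table 1 is not reproduced here, the distance between two directions is assumed to be the number of 45-degree steps separating them (0 to 4), which agrees with maxDiff(A, B) = 4*m for opposite directions. The function names and the code `ST` for South are also assumptions.

```python
# Directions in counter-clockwise order, 45 degrees apart (assumed encoding).
DIRECTIONS = ["ET", "NE", "NT", "NW", "WT", "SW", "ST", "SE"]


def direction_distance(a, b):
    """Number of 45-degree steps between two directions (0..4); an assumed
    stand-in for the distances of Table 1."""
    steps = abs(DIRECTIONS.index(a) - DIRECTIONS.index(b))
    return min(steps, 8 - steps)


def traj_sim(traj_a, traj_b):
    """TrajSim(A, B) for direction lists with 1 <= len(A) <= len(B)."""
    m, n = len(traj_a), len(traj_b)
    if not 1 <= m <= n:
        raise ValueError("expected 1 <= len(traj_a) <= len(traj_b)")
    # Slide A over B and keep the smallest summed distance (minDiff).
    min_diff = min(
        sum(direction_distance(traj_a[i], traj_b[i + j]) for i in range(m))
        for j in range(n - m + 1)
    )
    max_diff = 4 * m  # every compared pair points in opposite directions
    return (max_diff - min_diff) / max_diff
```

With this encoding, a query that matches a sub-sequence of the video trajectory exactly yields 1.0, and a query whose motions are all opposite to the video motions yields 0.0.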

Secondly, Shan and Lee [MS1] introduced similarity retrieval algorithms for
both a single moving object's and multiple moving objects' trajectories in
order to support content-based video retrieval. For retrieval based on the single
moving object's trajectory, they represented the trajectory of a moving object
as a sequence of segments, each being expressed as a slope ranging from 0 to
360 degrees. For the single moving object's trajectory, they proposed two
algorithms to measure the similarity between a query trajectory and moving objects'
trajectories in video data by using only the directional property, i.e., OCM (Optimal
Consecutive Mapping) and OCMR (Optimal Consecutive Mapping with
Replication). In order to represent the multiple moving objects' trajectories, they simply
used the 2D string scheme proposed by Chang [SQ1]. So, the multiple moving
objects' trajectories consist of a set of symbol objects, each being represented
as a 2D string. However, in the 2D-string scheme, it is difficult to express the
spatio-temporal relationships between moving objects precisely.

3 New Spatio-temporal Representation Scheme 

Both Li's and Shan's schemes concentrated on spatio-temporal relationships
between moving objects, but they did not consider their moving distance during
a given time interval. In order to support content-based retrieval on video data
well, it is necessary to consider the moving distance of objects in the video
data as well as their spatio-temporal relationships. For this, we propose a new
spatio-temporal representation scheme for modeling moving objects in video
data which considers their moving distance during a given time interval as well
as their temporal and spatial relations. In order to approximate the position of
an object, we use its MBR (Minimum Bounding Rectangle). We define a moving
object as one whose position changes over a given time interval and $(x_i, y_i)$ as
the center point of object A in XY-coordinates. So, the trajectory of the moving
object A is represented as $[(x_0, y_0, t_0), (x_1, y_1, t_1), \ldots, (x_n, y_n, t_n)]$ at times $t_0$,
$t_1$, ..., $t_n$. We define $I_i$ as the difference between a start frame and a finish frame
in a set of consecutive video frames. Here, $I_i$ means the time interval between time
$t_{i-1}$ and time $t_i$, $[t_{i-1}, t_i]$, as shown in Fig. 1. First, the single moving object's
trajectory is defined as follows.




Fig. 1. MBR representation of moving object (time interval $I_i = [t_{i-1}, t_i]$)



Definition 1. Let the motion ($M_i$) of a moving object A over time interval $I_i$ be
$(\alpha_i, D_i, I_i)$. Here, $\alpha_i$ is the direction from A at time $t_{i-1}$ to A at time
$t_i$, which is represented by a real angle ranging from 0 to 360 degrees. $D_i$ is the
relative moving distance (0 to 100) of A over $I_i$, which is computed by dividing
the moving distance during $I_i$ by the sum of the moving distances over $I_1$ through $I_n$
and multiplying by 100. For a given ordered list of time intervals $[I_1, I_2, \ldots, I_n]$,
the trajectory of a single moving object A can be described by a list of motions
$[M_1, M_2, \ldots, M_n]$, i.e.

$[(\alpha_1, D_1, I_1), (\alpha_2, D_2, I_2), \ldots, (\alpha_n, D_n, I_n)]$

Example 1. Fig. 2(a) shows a trajectory of object A. The trajectory for a single
moving object A consists of a sequence of motions which is expressed by [(0°,
15, $I_1$), (90°, 15, $I_2$), (40°, 23, $I_3$), (300°, 32, $I_4$), (0°, 15, $I_5$)] as shown in Fig.
2(b).
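
As an illustration of Definition 1, the following Python sketch derives a motion list from a sequence of MBR center points. It is a minimal sketch under stated assumptions: the function name `motion_list` is hypothetical, distances are Euclidean distances between consecutive center points, angles are measured counter-clockwise with the y-axis pointing up, and each relative distance is the share (0 to 100) of the total path length, so the values of one trajectory sum to 100 as in Example 1.

```python
import math


def motion_list(points):
    """Turn center points [(x, y, t), ...] into motions
    [(angle_in_degrees, relative_distance, (t_prev, t_cur)), ...]."""
    steps = []
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        angle = math.degrees(math.atan2(dy, dx)) % 360.0  # in [0, 360)
        steps.append((angle, math.hypot(dx, dy), (t0, t1)))
    total = sum(dist for _, dist, _ in steps)
    # Relative distance: share of the whole path, scaled to 0..100.
    return [
        (angle, 100.0 * dist / total if total else 0.0, interval)
        for angle, dist, interval in steps
    ]
```

For example, the points (0, 0), (3, 0), (3, 1) give two motions with angles 0 and 90 degrees and relative distances 75 and 25.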




(a) Trajectory
(b) Motion List
Fig. 2. Trajectory and motion list of an object A

In order to describe the spatial relationships between multiple moving
objects, we define seven topological operators constructed from the SMR scheme
proposed by Chang [JY1]: Faraway(FA), Disjoint(DJ), Meet(ME),
Overlap(OL), Is_included_by(CL), Include(IN), Same(SA). Fig. 3 depicts the seven
topological operators for a single dimension, namely the X-axis. The spatial
relation between two objects A and B on XY-coordinates is built from the
topological operator on each axis, as shown in Table 3.




Fig. 3. Seven topological operators on X-axis

Table 3. Spatial relation on XY-coordinates

Secondly, the multiple moving objects' trajectories are defined as follows.



Definition 2. $DM_i$ is the relative moving distance (0 to 100) of an object A
over $I_i$, compared with that of an object B. That is, $DM_i$ is 50 when the moving
distance of A is the same as that of B, while $DM_i$ approaches 100 as the
moving distance of A becomes greater than that of B, and approaches 0 in the converse case.



Definition 3. Let the spatio-temporal relationship ($STR_i$) from a moving
object A to a moving object B over time interval $I_i$ be $(R_i, \alpha_i, DM_i, I_i)$. Here
$R_i$ is the spatial relation on XY-coordinates from A to B over $I_i$ and $\alpha_i$ is the
direction from A to B at the start frame of $I_i$, which is expressed as an angle
from 0 to 360 degrees. For a given ordered list of time intervals $[I_1, I_2, \ldots, I_n]$, the
multiple moving objects' trajectories from A to B can be described by a list of
spatio-temporal relationships $[STR_1, STR_2, \ldots, STR_n]$, i.e.

$[(R_1, \alpha_1, DM_1, I_1), (R_2, \alpha_2, DM_2, I_2), \ldots, (R_n, \alpha_n, DM_n, I_n)]$



Example 2. Fig. 4(a) depicts a Car(C) and a Motorcycle(M) running a
race. The multiple moving objects' trajectories from C to M can be expressed
by

[(DJ, 260°, 30, $I_1$), (DJ, 290°, 50, $I_2$), (DJ, 45°, 55, $I_3$)]

(a) Car and Motorcycle objects' trajectory
(b) Spatio-temporal relationships from Car to Motorcycle
Fig. 4. Multiple motion trajectories and their spatio-temporal relationships



4 New Similarity Measure Algorithms

Most similarity measure algorithms for spatio-temporal representation schemes
depend only on spatial relationships to calculate the similarity between
a user query and moving objects in video databases. However, since the moving 
distance of an object during a time interval plays an important role in calculat- 
ing the similarity between a user query and moving objects more effectively, we 
propose new similarity measure algorithms to consider the moving distance of 
objects as well as their spatial relationships. The proposed similarity measures 
allow us to retrieve precise video results based on the moving distance of objects 
as well as to provide ranking for the retrieved video results to answer a user 
query. For a single motion trajectory, we first propose SDST(Similarity measure 
based on moving Distances for Single object’s Trajectories) in the following. 

Definition 4. For a single moving object's trajectory $VS = \{VS_1, VS_2, \ldots, VS_M\}$
in video databases and a query trajectory $QS = \{QS_1, QS_2, \ldots, QS_N\}$ ($1 \leq N \leq M$),
the difference between the angle of a video motion $VS_i$ and that of a query
motion $QS_i$, $D_{ang}(VS_i, QS_i)$, is defined as

$$D_{ang}(VS_i, QS_i) = \begin{cases} 360^\circ - |VS_i - QS_i| & \text{if } |VS_i - QS_i| > 180^\circ \\ |VS_i - QS_i| & \text{otherwise} \end{cases}$$



Example 3. For a given video trajectory V and query trajectory Q, the difference
between the angle of a video motion $V_i$ and that of a query motion $Q_i$, $D_{ang}(V_i, Q_i)$,
is as follows:

$D_{ang}(V_1, Q_1) = |15^\circ - 40^\circ| = 25^\circ$
$D_{ang}(V_2, Q_2) = 360^\circ - |5^\circ - 355^\circ| = 10^\circ$
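
The angle difference of Definition 4 amounts to taking the shorter way around the circle. A minimal Python sketch (the function name `angle_difference` is an assumption) that reproduces the two values of Example 3:

```python
def angle_difference(video_deg, query_deg):
    """D_ang: difference between two directions in degrees, in [0, 180]."""
    diff = abs(video_deg - query_deg)
    # Differences above 180 degrees are measured the other way around.
    return 360.0 - diff if diff > 180.0 else diff
```

Calling `angle_difference(15, 40)` gives 25 and `angle_difference(5, 355)` gives 10, matching Example 3.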



Definition 5. Given a single video moving object's trajectory $VS = \{VS_1, VS_2,
\ldots, VS_M\}$ and a query trajectory $QS = \{QS_1, QS_2, \ldots, QS_N\}$ ($1 \leq N \leq M$), the
similarity between the direction of a video motion $VS_i$ and that of a query motion
$QS_i$, $SR_i(VS_i, QS_i)$, is defined as follows. Because $\cos(0^\circ) = 1$ and $\cos(180^\circ)
= -1$, $SR_i(VS_i, QS_i)$ has a value between 0 and 1.

$$SR_i(VS_i, QS_i) = \frac{\cos(D_{ang}(VS_i, QS_i)) + 1}{2}$$
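
A small Python sketch of the direction similarity of Definition 5, written under the assumption that angles are given in degrees; the function names are hypothetical. It maps an angle difference of 0 degrees to 1, 90 degrees to 0.5, and 180 degrees to 0.

```python
import math


def angle_difference(video_deg, query_deg):
    """D_ang of Definition 4, in degrees, in [0, 180]."""
    diff = abs(video_deg - query_deg)
    return 360.0 - diff if diff > 180.0 else diff


def direction_similarity(video_deg, query_deg):
    """SR_i: (cos(D_ang) + 1) / 2, a value between 0 and 1."""
    d_ang = angle_difference(video_deg, query_deg)
    return (math.cos(math.radians(d_ang)) + 1.0) / 2.0
```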



Definition 6. Given a single video moving object's trajectory $VS = \{VS_1, VS_2,
\ldots, VS_M\}$ and a query trajectory $QS = \{QS_1, QS_2, \ldots, QS_N\}$ ($1 \leq N \leq M$), the
similarity between the moving distance of a video motion $VS_i$ and that of a
query motion $QS_i$, $SD_i(VS_i, QS_i)$, is defined as

$$SD_i(VS_i, QS_i) = 1 - \frac{|D_i^{VS} - D_i^{QS}|}{Max(D_i^{VS}, D_i^{QS})}$$
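
The distance similarity of Definition 6 can be sketched in Python as below. The function name is hypothetical, and the handling of the case where both distances are zero (returning 1.0, since the motions are then identical in distance) is an assumption, because the definition leaves that case open.

```python
def distance_similarity(d_video, d_query):
    """SD_i: 1 - |D_video - D_query| / max(D_video, D_query)."""
    largest = max(d_video, d_query)
    if largest == 0:
        return 1.0  # assumed convention: two zero distances are identical
    return 1.0 - abs(d_video - d_query) / largest
```

For instance, relative distances 10 and 20 give a similarity of 0.5, and equal distances give 1.0.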



Definition 7. Given a single video moving object's trajectory $VS = \{VS_1, VS_2,
\ldots, VS_M\}$ and a query trajectory $QS = \{QS_1, QS_2, \ldots, QS_N\}$ ($1 \leq N \leq M$), the
similarity between a video trajectory VS and a query trajectory QS, SDST(VS,
QS), by using Definitions 5 and 6 is defined as follows. Here, $w_1$ and $w_2$ mean the
weight of the direction and that of the distance, respectively.

$$SDST(VS, QS) = MAX\Big\{ \frac{\sum_{i=1}^{N} \big( w_1 \cdot SR_i(VS_{i+j}, QS_i) + w_2 \cdot SD_i(VS_{i+j}, QS_i) \big)}{N} \Big\} \quad (\forall j,\ 0 \leq j \leq M - N)$$
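
To make the sliding-window idea behind SDST concrete, here is a hedged Python sketch. It assumes one reading of Definition 7: for each offset j from 0 to M-N, the weighted sum of the direction similarity and the distance similarity is averaged over the N query motions, and the maximum over all offsets is returned. Motions are encoded as `(angle_in_degrees, relative_distance)` pairs; this encoding and all function names are assumptions made for the example.

```python
import math


def angle_difference(a, b):
    diff = abs(a - b)
    return 360.0 - diff if diff > 180.0 else diff


def direction_similarity(a, b):
    return (math.cos(math.radians(angle_difference(a, b))) + 1.0) / 2.0


def distance_similarity(d_video, d_query):
    largest = max(d_video, d_query)
    # Assumed convention: two zero distances count as identical.
    return 1.0 if largest == 0 else 1.0 - abs(d_video - d_query) / largest


def sdst(video, query, w1=0.5, w2=0.5):
    """Best average similarity of the query over all alignments with the video."""
    m, n = len(video), len(query)
    if not 1 <= n <= m:
        raise ValueError("expected 1 <= len(query) <= len(video)")
    best = 0.0
    for j in range(m - n + 1):  # offset of the query inside the video
        total = sum(
            w1 * direction_similarity(video[i + j][0], query[i][0])
            + w2 * distance_similarity(video[i + j][1], query[i][1])
            for i in range(n)
        )
        best = max(best, total / n)
    return best
```

With equal weights of 0.5, a query that matches a consecutive part of the video trajectory exactly scores 1.0, so the result can be used directly to rank retrieved shots.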

Secondly, we propose SDMT (Similarity measure based on moving Distance for 
Multiple objects’ Trajectories) for multiple moving objects’ trajectories. The 
SDMT computes the similarity based on the moving distance of objects as well 
as topological and directional relations in the following. 

Definition 8. Given a video trajectory of multiple moving objects $VM = \{VM_1,
VM_2, \ldots, VM_M\}$ and a query trajectory $QM = \{QM_1, QM_2, \ldots, QM_N\}$ ($1 \leq N \leq M$),
the similarity of the topological relation between $VM_i$ and $QM_i$, $ST_i(VM_i, QM_i)$,
is defined as follows. Here, Fig. 5 shows the similarity distance graph between
topological relations. Sim-Dist($VR_1$, $VR_2$) means the similarity distance between
$VR_1$ and $VR_2$ as shown in Table 4. $ST_i(VM_i, QM_i)$ is inversely proportional
to the square of Sim-Dist($VR_1$, $VR_2$). So, $\lambda$ is used to smooth the curve of
$ST_i(VM_i, QM_i)$.

Table 4. Similarity distance between topological operators (Sim-Dist)




Fig. 5. Similarity distance graph 

Definition 9. Given a video trajectory of multiple moving objects $VM = \{VM_1,
VM_2, \ldots, VM_M\}$ and a query trajectory $QM = \{QM_1, QM_2, \ldots, QM_N\}$ ($1 \leq N \leq M$),
the similarity between the moving distance of a video motion $VM_i$ and that of a
query motion $QM_i$, $SDM_i(VM_i, QM_i)$, is defined as follows. Because
$|DM_i^{VM} - DM_i^{QM}|$ ranges from 0 to 100, $SDM_i(VM_i, QM_i)$ has a value between 0 and 1.

$$SDM_i(VM_i, QM_i) = 1 - \frac{|DM_i^{VM} - DM_i^{QM}|}{100}$$
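
Definition 9 translates directly into a one-line computation. The Python sketch below uses a hypothetical function name and assumes both relative moving distances are already on the 0 to 100 scale of Definition 2.

```python
def relative_distance_similarity(dm_video, dm_query):
    """SDM_i: 1 - |DM_video - DM_query| / 100, for DM values in 0..100."""
    if not (0 <= dm_video <= 100 and 0 <= dm_query <= 100):
        raise ValueError("relative moving distances must lie in 0..100")
    return 1.0 - abs(dm_video - dm_query) / 100.0
```

For example, DM values of 30 and 55 give a similarity of 0.75.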







Definition 10. Given a video trajectory of multiple moving objects $VM = \{VM_1,
VM_2, \ldots, VM_M\}$ and a query trajectory $QM = \{QM_1, QM_2, \ldots, QM_N\}$ ($1 \leq N \leq M$),
the similarity between a video trajectory VM and a query trajectory QM by using
Definitions 5, 8 and 9, SDMT(VM, QM), is defined as follows. Here, $w_1$,
$w_2$ and $w_3$ mean the weight of topological relations, that of the direction, and
that of the distance, respectively.

$$SDMT(VM, QM) = MAX\Big\{ \frac{\sum_{i=1}^{N} \big( w_1 \cdot ST_i(VM_{i+j}, QM_i) + w_2 \cdot SR_i(VM_{i+j}, QM_i) + w_3 \cdot SDM_i(VM_{i+j}, QM_i) \big)}{N} \Big\} \quad (\forall j,\ 0 \leq j \leq M - N)$$



5 Performance Analysis 



In order to verify the usefulness of our spatio-temporal representation scheme,
we conducted an experiment with video data of soccer (football) games. Because
users generally consider the soccer ball as a salient object in a soccer game video,
we extract the trajectories of the soccer ball from the video data. Most of the video
data used in our experiment, which are formatted as MPEG files (*.mpeg), include
a shot of "getting a goal". We extract the trajectory of a soccer ball by manually
tracing the ball on the soccer field. Fig. 6 and 7 show an example of extracting
the single trajectory of a soccer ball from the video data and an example of
extracting the multiple trajectories between a soccer ball and a player, respectively.




Fig. 6. Example of single trajectory: (a) original video frame; (b) single object's trajectory extracted from (a)

Fig. 7. Example of multiple trajectories: (a) original video frame; (b) multiple objects' trajectories extracted from (a)

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Table 5. Experimental data set 





                                   Single object's trajectory   Multiple objects' trajectories
# of video shots in the data set   350                          200
# of motions in a shot             2-15                         2-10
# of query types                   40                           10
# of motions in a query            2-3                          3-4

For our experiments, we use the data sets for a single moving object's
trajectory and for multiple moving objects' trajectories shown in Table 5.
By considering possible "getting a goal" trajectories as a single moving
object's trajectory, we make forty query trajectories, twenty in "the
right field" and twenty in "the left field" with respect to the half line of the
soccer field. In addition, for multiple moving objects' trajectories, we make ten
query trajectories, five in "the right field" and five in "the left field".
For smoothing the Sim-Dist curve, we choose $\lambda = 10$. We also set
$w_1 = w_2 = 0.5$ for a single moving object's trajectory and
$w_1 = w_2 = w_3 = 0.33$ for multiple moving objects' trajectories, because the
weights are assumed to be equal.

For our performance analysis, we implemented our spatio-temporal representation
scheme as well as Li's and Shan's schemes on a Windows PC with 128
MB of memory by using Microsoft Visual C++. We compare our scheme with
Li's and Shan's schemes in terms of retrieval effectiveness, that is, precision and
recall measures [GSl]. Let RVR (Relevant Video data that are Retrieved) be the
number of relevant video data retrieved by a given query, RVD (Relevant Video
data in Database) be the number of video data in the database that are judged
manually to be relevant to the query, and RVQ (Retrieved Video data by Query)
be the total number of video data retrieved by the query. To compute RVD, we
form a test panel which finds relevant video data manually in the database. The
test panel is composed of 10 graduate students from our Computer Engineering
department. The precision is defined as the proportion of retrieved video data
that are relevant, and the recall is defined as the proportion of relevant video
data that are retrieved, as follows.



$$\mathrm{Precision} = \frac{RVR}{RVQ}, \qquad \mathrm{Recall} = \frac{RVR}{RVD}$$
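
The two measures can be computed per query as in the following minimal sketch. The function name and the use of shot identifiers are illustrative assumptions; the original implementation was written in Visual C++ and is not shown in the paper.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: shot ids returned by the query (RVQ is its size)
    relevant:  shot ids judged relevant by the test panel (RVD is its size)
    Both argument names are hypothetical, chosen for this sketch.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    rvr = len(retrieved & relevant)  # relevant shots that were retrieved
    precision = rvr / len(retrieved) if retrieved else 0.0
    recall = rvr / len(relevant) if relevant else 0.0
    return precision, recall


# Four shots retrieved, three relevant shots in the database, two in common.
p, r = precision_recall(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5])
```

Here `p` is 2/4 and `r` is 2/3, since two of the four retrieved shots are relevant and two of the three relevant shots were retrieved.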



For our performance comparison, we adopt the 11-point measure [SMl], which
is most widely used for measuring precision and recall. Table 6 shows the
average precision and the average recall values of our scheme, Li's scheme and
Shan's scheme for single and multiple moving objects' trajectories, respectively.
For the single trajectory, our scheme holds about 20 percentage points higher
precision than Li's scheme (0.43 versus 0.23), while its recall is about the same
(0.44 versus 0.42). Our scheme also achieves 17 percentage points higher precision
than Shan's scheme while it holds about the same recall. For the multiple
trajectories, our scheme holds about 10 percentage points higher precision and
about 5 percentage points higher recall than Li's scheme. Our scheme also achieves
about 20 percentage points higher precision and about 15 percentage points higher
recall than Shan's scheme. Fig. 8 shows the recall-precision graphs of our scheme,
Li's scheme and Shan's scheme.

Table 6. Comparison of retrieval effectiveness

                 Single object's trajectory           Multiple objects' trajectories
                 Average Precision   Average Recall   Average Precision   Average Recall
Li's scheme      0.23                0.42             0.36                0.50
Shan's scheme    0.26                0.46             0.21                0.39
Our scheme       0.43                0.44             0.45                0.54
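
For readers unfamiliar with the 11-point measure, the sketch below shows the standard interpolated form used in information retrieval: precision is reported at the recall levels 0.0, 0.1, ..., 1.0, and the value at each level is the best precision reached at any recall at or above it. The paper does not list its exact procedure, so the function name and the input format (a ranked result list) are assumptions.

```python
def eleven_point_precision(ranked, relevant):
    """Interpolated precision at recall 0.0, 0.1, ..., 1.0 for one ranked result list."""
    relevant = set(relevant)
    if not relevant:
        return [0.0] * 11
    points = []  # (recall, precision) measured after each rank position
    hits = 0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
        points.append((hits / len(relevant), hits / rank))
    levels = [i / 10 for i in range(11)]
    # Interpolated precision at level r: best precision at any recall >= r.
    return [max((p for rec, p in points if rec >= r), default=0.0) for r in levels]


curve = eleven_point_precision(ranked=["a", "b", "c", "d"], relevant={"a", "c"})
average_precision = sum(curve) / len(curve)
```

Averaging such curves over all query types gives one recall-precision graph per scheme.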




Fig. 8. Recall-precision graph for single and multiple trajectories: (a) single trajectory; (b) multiple trajectories
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6 Conclusions 

For efficient content-based retrieval of video data, we proposed a new
spatio-temporal representation scheme for moving objects in video data that
represents the spatio-temporal relationships between moving objects more precisely.
In addition, we proposed two similarity measures, called SDST and SDMT, which
consider the moving distance of an object during a given time interval as well as
its spatial relations, so that the similarity between a user query and moving
objects can be calculated more effectively. They allow us to retrieve precise video
results based on the moving distance of objects and to rank the retrieved video
results for a user query. For our performance analysis, we implemented our
spatio-temporal representation scheme and compared it with Li's and Shan's
schemes in terms of retrieval effectiveness. Our two experiments show that, for a
single object's trajectory, our scheme achieves about 20 percentage points higher
precision while it holds about the same recall, compared with Li's and Shan's
schemes. For multiple objects' trajectories, our scheme achieves about 10-20
percentage points higher precision and about 5-15 percentage points higher recall.
As future work, the usefulness of our spatio-temporal representation scheme
needs to be demonstrated by applying it to real applications dealing with a large
amount of soccer game video data.
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Node-to-Set Disjoint Paths Problem in Rotator 

Graphs 



Keiichi Kaneko and Yasuto Suzuki 
Tokyo University of Agriculture and Technology, Tokyo 184-8588, Japan 



Abstract. In this paper, we give an algorithm for the node-to-set dis- 
joint paths problem in rotator graphs. The algorithm is based on recur- 
sion and it is divided into cases according to the distribution of desti- 
nation nodes in classes into which all the nodes in a rotator graph are 
categorized. The proof of correctness of our algorithm, the sum of the 
length of paths, and the time complexity are also given. 



1 Introduction 

As an unrestricted improvement in the performance of sequential computation
is currently difficult to achieve, studies of parallel and distributed computation
are becoming more significant. Moreover, research on so-called massively
parallel machines, which have a very large number of processing elements, has been
conducted enthusiastically in recent years. Hence many complex topologies of
interconnection networks have been proposed to replace simple networks
such as a hypercube and a mesh [1,4,5,8,9,10]. Unfortunately, there still remain
unknowns in several metrics for these topologies, making a clear comparison of
them difficult. A rotator graph by Corbett [3] is one of the new topologies that
shows promise in that it has a low degree and a small diameter in comparison
with the number of nodes [2,12]. However, some of its metrics have not yet been
clarified, among which is the node-to-set disjoint paths problem: Given a
source node $s$ and a set $D = \{d_1, d_2, \ldots, d_k\}$ ($s \notin D$) of $k$ destination nodes in
a $k$-connected graph $G = (V, E)$, find $k$ paths from $s$ to $d_i$ ($1 \le i \le k$) which
are node-disjoint except for $s$. This is one of the most important issues in the
design and implementation of parallel and distributed computing systems [6,11].
In general, node-disjoint paths can be obtained by making use of the maximum
flow algorithm in polynomial order of $|V|$. In an $n$-rotator graph, however, the
number of nodes is equal to $n!$, so this approach is not considered efficient. In
this paper, we give an algorithm for this problem which is of polynomial order of
$n$ instead of $n!$, and it is explained in detail with a proof of its correctness.

2 Preliminaries 

In this section, we first give a definition of a rotator graph, then give some 
comparisons between a rotator graph and other major network topologies for 
several elementary indices. 



J. He and M. Sato (Eds.): ASIAN 2000, LNCS 1961, pp. 119-132, 2000.
(c) Springer-Verlag Berlin Heidelberg 2000

Definition 1. An $n$-rotator graph, $P_n$, is a directed graph which has $n!$ nodes.
Each node has a unique label $(a_1, a_2, \ldots, a_n)$ comprised of a permutation of $n$
figures: $1, 2, \ldots, n$. In addition, there exists an edge $(a, b)$ between two nodes
$a = (a_1, a_2, \ldots, a_n)$ and $b = (b_1, b_2, \ldots, b_n)$ if and only if there exists $i$ ($2 \le i \le n$)
such that $b_1 = a_2, b_2 = a_3, \ldots, b_{i-1} = a_i, b_i = a_1, b_{i+1} = a_{i+1}, \ldots, b_n = a_n$.
Here, let $R_i$ represent the operation to obtain the node $b$ from the node $a$.

An $n$-rotator graph contains $n$ different $(n-1)$-subrotator graphs. All of
the nodes in each subrotator graph share the same last figure $k$ in their
labels, and the subrotator graph is specified by $P_{n-1}k$. Then any edge between
two nodes which belong to different subrotator graphs $P_{n-1}h$ and $P_{n-1}k$ ($h \ne k$)
is given only by the operation $R_n$. Fig. 1 presents some examples of rotator
graphs.
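
Definition 1 can be made concrete with a short sketch. Labels are represented as Python tuples, and the function names `rotate`, `successors` and `rotator_graph` are illustrative choices, not names from the paper.

```python
from itertools import permutations


def rotate(a, i):
    """Apply R_i: move the first figure of label a to position i (1-based)."""
    if not 2 <= i <= len(a):
        raise ValueError("i must satisfy 2 <= i <= n")
    return a[1:i] + a[:1] + a[i:]


def successors(a):
    """Nodes reachable from a by one edge, i.e. R_2(a), ..., R_n(a)."""
    return [rotate(a, i) for i in range(2, len(a) + 1)]


def rotator_graph(n):
    """Adjacency dict of the n-rotator graph (only practical for small n)."""
    return {a: successors(a) for a in permutations(range(1, n + 1))}


g = rotator_graph(4)
```

For example, `rotate((1, 2, 3, 4), 3)` yields `(2, 3, 1, 4)`. Only `rotate(a, n)` changes the last figure of a label, which is why it is the only operation that leaves a subrotator graph.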



Fig. 1. Examples of rotator graphs.



Table 1 shows the comparison of an $n$-rotator graph with an $n$-star graph
and an $n$-cube. From this table, we can see that the $n$-rotator graph has a
smaller diameter than the $n$-star graph for the same number of nodes and the
same connectivity, and that it has far more nodes than the $n$-cube for a
comparable diameter. In addition, the connectivity also gives the number of paths
which must be constructed in the node-to-set disjoint paths problem.
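
To get a feel for the comparison, the entries of Table 1 can be evaluated for a concrete $n$. The helper below is only an illustration; its name and the dictionary layout are assumptions made for this sketch.

```python
from math import factorial


def topology_metrics(n):
    """Node count, diameter and connectivity from Table 1 for a concrete n."""
    return {
        "n-rotator graph": {"nodes": factorial(n), "diameter": n - 1, "connectivity": n - 1},
        "n-star graph": {"nodes": factorial(n), "diameter": (3 * (n - 1)) // 2, "connectivity": n - 1},
        "n-cube": {"nodes": 2 ** n, "diameter": n, "connectivity": n},
    }


metrics = topology_metrics(7)
```

With $n = 7$, the rotator graph and the star graph both have 5040 nodes, with diameters 6 and 9 respectively, while the 7-cube has 128 nodes and diameter 7.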

Next, we define a class, which is a subset of nodes, and the node-to-set disjoint 
paths problem. 






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Table 1. Comparison of a rotator graph with other topologies. 





                   n-rotator graph   n-star graph    n-cube
number of nodes    n!                n!              2^n
number of edges    (n-1) x n!        (n-1) x n!      n 2^n
diameter           n-1               ⌊3(n-1)/2⌋      n
connectivity       n-1               n-1             n


Definition 2. A class of an $n$-rotator graph is a set of nodes in which for any
pair of nodes $a$ and $b$, the node $a$ is obtained from the node $b$ by iterative
application of the operation $R_n$.

The class to which a node $a$ belongs is specified by $C(a)$. The following are
properties of classes.

1. Each class has $n$ nodes which form a directed ring structure.

2. Every node in $P_n$ belongs to exactly one class.

3. Every node in $P_n$ has $n-1$ parent nodes, all of which belong to different
classes.
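
The three properties can be checked exhaustively for small $n$. The sketch below represents labels as tuples; the helper names `rotate_all`, `node_class` and `parents` are introduced here for illustration only.

```python
from itertools import permutations


def rotate_all(a):
    """Apply R_n: move the first figure of label a to the last position."""
    return a[1:] + a[:1]


def node_class(a):
    """C(a) as a frozenset: all nodes reachable from a by repeated R_n."""
    members, b = [], a
    for _ in range(len(a)):
        members.append(b)
        b = rotate_all(b)
    return frozenset(members)


def parents(b):
    """Nodes a with an edge a -> b, one for each operation R_2, ..., R_n."""
    return [(b[i - 1],) + b[:i - 1] + b[i:] for i in range(2, len(b) + 1)]


nodes = list(permutations(range(1, 5)))
classes = {node_class(a) for a in nodes}
```

For $n = 4$ this gives 6 classes of 4 nodes each, and the 3 parents of every node fall into 3 distinct classes.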

Definition 3. The node-to-set disjoint paths problem in an $n$-rotator graph is
to find $n-1$ paths from a source node $s$ to each node in the destination node set
$D = \{d_1, d_2, \ldots, d_{n-1}\}$ which are disjoint except for $s$.

3 Algorithm 

In this section we give an algorithm for the node-to-set disjoint paths problem
and prove its correctness by induction on $n$. Because of the symmetry of $P_n$,
we can fix the source node $s$ to be $(1, 2, \ldots, n)$ without any loss of generality.
For a 2-rotator graph $P_2$, the problem is trivial, and we assume that $n > 2$ in the
following. Let $D = \{d_1, d_2, \ldots, d_{n-1}\}$ represent the set of destination nodes. Our
algorithm is composed of procedures corresponding to the cases given below.

Case I: There exists at least one single class with multiple destination nodes in
$P_n$.

Case II: Each class in $P_n$ has at most one destination node.

  Case II-1: The class $C(s)$ includes exactly one destination node.

  Case II-2: $C(s)$ has no destination node.

    Case II-2-A: All the destination nodes in $P_n$ belong to $P_{n-1}n$.

    Case II-2-B: At least one destination node in $P_n$ belongs to a subrotator
    graph other than $P_{n-1}n$.

      Case II-2-B-a: Each subrotator graph of $P_n$ other than $P_{n-1}n$ has
      at most one destination node.

      Case II-2-B-b: There is a subrotator graph of $P_n$ other than $P_{n-1}n$
      which includes multiple destination nodes.

The following subsections present procedures for the leaf cases, that is, Case I,
Case II-1, Case II-2-A, Case II-2-B-a and Case II-2-B-b, as well as proofs of their
correctness.
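
The case analysis above can be sketched as a small dispatcher. This is only an illustration under the assumption that the source is fixed to $s = (1, 2, \ldots, n)$ and that labels are tuples; `class_key` and `classify` are hypothetical names. A class is identified by its unique member in $P_{n-1}n$, that is, the rotation of the label that ends with the figure $n$.

```python
from collections import Counter


def class_key(a):
    """Representative of C(a): the rotation of label a that ends with figure n."""
    j = a.index(len(a))
    return a[j + 1:] + a[:j + 1]


def classify(dests):
    """Return the leaf case that applies to the destination list dests."""
    n = len(dests[0])
    s = tuple(range(1, n + 1))
    per_class = Counter(class_key(d) for d in dests)
    if max(per_class.values()) > 1:
        return "I"          # some class holds several destinations
    if class_key(s) in per_class:
        return "II-1"       # C(s) holds exactly one destination
    # Count destinations per subrotator graph other than the one ending with n.
    outside = Counter(d[-1] for d in dests if d[-1] != n)
    if not outside:
        return "II-2-A"
    return "II-2-B-a" if max(outside.values()) == 1 else "II-2-B-b"


case = classify([(2, 1, 3, 4), (3, 1, 4, 2), (3, 2, 4, 1)])
```

In this example the three destinations lie in three different classes, none of them in $C(s)$, and the two destinations outside $P_3 4$ lie in different subrotator graphs, so `case` is `"II-2-B-a"`.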



3.1 Case I 

In this subsection, we will consider the case that there exist multiple destination
nodes in a single class. Let $\mathcal{C} = \{C_1, C_2, \ldots, C_k\}$ ($k < n-1$) be the collection
of classes to which destination nodes belong. Using Procedure 1 below, we can
construct $n-1$ paths from the source node $s$ to the $n-1$ destination nodes in
$D$ which are disjoint except for $s$.

Procedure 1

1. Let $D_1$ be a node set constructed by choosing one destination node from each
class in $\mathcal{C}$ such that the figure $n$ occurs in the label of the node in the
rightmost position compared to all the other destination nodes in the class. That
is, any destination node in $D_1$ is nearest to the subrotator graph $P_{n-1}n$
with respect to the operation $R_n$ amongst the destination nodes which
belong to the same class. Additionally, let $D_2$ be the rest
of the destination nodes, $D - D_1$. Then $|D_1| = k$ and $|D_2| = n-k-1$. Without
loss of generality, we can assume that $D_2 = \{d_1, d_2, \ldots, d_{n-k-1}\}$. See Fig. 2.

Fig. 2. The sets $D_1$ and $D_2$.

2. For each node $d_i$ in $\{d_1, d_2, \ldots, d_{n-k-2}\}$, select a parent node $c_i$ satisfying
the following two conditions in a greedy manner. Note that the node $d_{n-k-1}$
is not considered in this step because the probability of obtaining some gain
due to its parent is very small compared to the cost to be paid.

   Condition 1: If $i \ne j$, $C(c_i) \ne C(c_j)$.

   Condition 2: $\forall i$, $C(c_i) \notin \mathcal{C}$.

3. Let $D_1 \leftarrow D_1 \cup \{c_i \mid 1 \le i \le n-k-2\}$, and $\mathcal{C} \leftarrow \mathcal{C} \cup \{C(c_i) \mid 1 \le i \le n-k-2\}$.
Then $|\mathcal{C}| = n-2$.

4. (a) In the case that $C(s) \in \mathcal{C}$: (See Fig. 3.)

Fig. 3. Case $C(s) \in \mathcal{C}$.

Fig. 4. Selection of a parent node $c_{n-k-1}$.



      i. For the node $d_{n-k-1}$, select a parent node $c_{n-k-1}$ of $d_{n-k-1}$ which
         satisfies the following condition in a greedy manner.

         Condition: $C(c_{n-k-1}) \notin \mathcal{C}$.

      ii. Let $D_1 \leftarrow D_1 \cup \{c_{n-k-1}\}$, and $\mathcal{C} \leftarrow \mathcal{C} \cup \{C(c_{n-k-1})\} - \{C(s)\}$. See Fig. 4.

      iii. Let $v_i$ ($1 \le i \le n-2$) be the nodes which belong to the classes in $\mathcal{C}$ and
         also belong to the subrotator graph $P_{n-1}n$.

      iv. In $P_{n-1}n$, obtain $n-2$ paths from $s$ to $v_i$ ($1 \le i \le n-2$) which are
         disjoint except for $s$ by calling the algorithm recursively.

      v. Establish paths from $v_i$ ($1 \le i \le n-2$) and $s$ to the corresponding
         nodes in $D_1$ within the classes.

      vi. Select edges $c_i \to d_i$ ($1 \le i \le n-k-1$). See Fig. 5.

Fig. 5. Establishment of paths.

   (b) In the case that $C(s) \notin \mathcal{C}$: (See Fig. 6.)

      i. Let $v_i$ ($1 \le i \le n-2$) be the nodes which belong to the classes in $\mathcal{C}$
         and also belong to the subrotator graph $P_{n-1}n$.

      ii. In $P_{n-1}n$, obtain $n-2$ paths from the node $s$ to $v_i$ ($1 \le i \le n-2$)
         which are disjoint except for $s$ by calling the algorithm recursively.

      iii. Establish paths from $v_i$ ($1 \le i \le n-2$) to the corresponding nodes in
         $D_1$ within the classes.

      iv. Select edges $c_i \to d_i$ ($1 \le i \le n-k-2$). See Fig. 7.

      v. Let $P_{n-1}l$ be the subrotator graph which includes $d_{n-k-1}$.

      vi. Let $s'$ represent the node which belongs to $C(s)$ and $P_{n-1}l$. Then we
         can select a path from $s$ to the node $s'$ within the class $C(s)$.

      vii. Let $l_i$ ($1 \le i \le n-2$) represent the nodes which belong to the classes
         in $\mathcal{C}$ and also belong to the subrotator graph $P_{n-1}l$.

      viii. In $P_{n-1}l$, construct $n-2$ internally disjoint paths from $s'$ to $d_{n-k-1}$ [7].
         Select a path $s' \to d_{n-k-1}$ among them which does not include any
         of the $n-3$ nodes $\{l_1, l_2, \ldots, l_{n-2}\} - \{d_{n-k-1}\}$. See Fig. 8.

Fig. 8. Establishment of paths.

Lemma 4. The $n-1$ paths $s \to d_i$ ($1 \le i \le n-1$) established in Procedure 1
are disjoint except for $s$. Additionally, the sum of the length of paths established
in Procedure 1 excluding those which are constructed by the recursive call of this
algorithm is of $O(n^2)$.

Proof. The proof is divided into two cases depending on whether $C(s)$ is included
in $\mathcal{C}$ or not.

1. $C(s) \in \mathcal{C}$

The paths selected in step 4(a)iv of Procedure 1 are disjoint except for $s$ by
the induction hypothesis. The $n-1$ paths selected in step 4(a)v are
disjoint because of property 2 of classes. The paths selected in step 4(a)v
and the edges in step 4(a)vi are disjoint except for the nodes $c_i$
($1 \le i \le n-k-1$). All the nodes which are used to construct the paths
selected in step 4(a)iv have the figure $n$ in the last position of their labels. In
addition, all the nodes which are used to construct the paths selected in step
4(a)v and the edges selected in step 4(a)vi have last figures other than $n$ in
their labels except for $v_i$ ($1 \le i \le n-2$) and $s$. Hence, the paths selected
in step 4(a)iv and the paths and edges selected in steps 4(a)v and 4(a)vi are
disjoint. Therefore, if $C(s)$ is in the collection $\mathcal{C}$, the $n-1$ paths
constructed by Procedure 1 are disjoint except for node $s$.

In addition, the sum of the length of the $n-1$ paths established in step
4(a)v is of $O(n^2)$. The sum of the length of the $n-k-1$ edges selected in
step 4(a)vi is of $O(n)$. Summing them up, in the case of $C(s) \in \mathcal{C}$, the sum
of the length of paths established in Procedure 1 excluding those which are
constructed by the recursive call of the algorithm is of $O(n^2)$.

2. $C(s) \notin \mathcal{C}$

Following similar reasoning as in the case of $C(s) \in \mathcal{C}$, the paths selected
in steps 4(b)ii to 4(b)iv are disjoint except for $s$ by using the induction
hypothesis. The $n-2$ paths selected in steps 4(b)ii to 4(b)iv and the path
selected in step 4(b)vi are disjoint except for $s$ because of
property 2 of classes. The path $s' \to d_{n-k-1}$ selected in step 4(b)viii and the
paths selected in steps 4(b)ii to 4(b)vi are disjoint except for $s'$. Therefore, if
the class $C(s)$ is not in the collection $\mathcal{C}$, the $n-1$ paths selected in Procedure
1 are disjoint except for $s$.

Moreover, the sum of the length of the $n-2$ paths established in step 4(b)iii
is of $O(n^2)$. The sum of the length of the $n-k-2$ edges selected in step
4(b)iv is of $O(n)$. In addition, the length of each path selected in steps 4(b)vi
and 4(b)viii is of $O(n)$. Summing them up, in the case of $C(s) \notin \mathcal{C}$, the sum
of the length of paths established in Procedure 1 excluding those which are
constructed by the recursive call of the algorithm is of $O(n^2)$. □

3.2 Case II-1

In this subsection, we will consider the case that there is at most one destination
node in each class and there is a destination node in the class to which the source
node $s$ belongs. By Procedure 2 below, we can construct $n-1$ paths from $s$ to
the $n-1$ nodes in $D$ which are disjoint except for $s$.

Procedure 2

1. Let $b$ be the destination node which belongs to the class $C(s)$.

2. Select the path within the class $C(s)$ from $s$ to $b$.

3. Let $v_i$ ($1 \le i \le n-2$) represent the nodes in $P_{n-1}n$ and also in the classes
to which the $n-2$ nodes in the set $D - \{b\}$ belong.

4. Obtain the paths from $s$ to $v_i$ ($1 \le i \le n-2$) which are disjoint except for
$s$ by calling the algorithm recursively.

5. Select a path within the class from each node $v_i$ ($1 \le i \le n-2$) to the
corresponding node in $D - \{b\}$.
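
Steps 2 and 5 of Procedure 2 both walk along the directed ring of a class. A minimal sketch of that walk is given below, assuming tuple labels; the names `class_representative` and `class_path` are introduced here and do not appear in the paper.

```python
def class_representative(d):
    """The node of C(d) that lies in the subrotator graph ending with figure n."""
    j = d.index(len(d))
    return d[j + 1:] + d[:j + 1]


def class_path(a, b):
    """Path a -> ... -> b that uses only R_n edges; a and b must share a class."""
    path = [a]
    for _ in range(len(a)):
        if path[-1] == b:
            return path
        path.append(path[-1][1:] + path[-1][:1])  # one application of R_n
    raise ValueError("b is not in the class of a")


s = (1, 2, 3, 4)
to_b = class_path(s, (3, 4, 1, 2))        # step 2: s to b inside C(s)
d = (3, 1, 4, 2)
v = class_representative(d)               # step 3: entry point in the last subrotator graph
to_d = class_path(v, d)                   # step 5: v to d inside C(d)
```

Every node on such a path except the first one has a last figure other than $n$, which is the fact the proof of Lemma 5 relies on when separating these paths from the recursively obtained ones.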



Lemma 5. The $n-1$ paths $s \to d_i$ ($1 \le i \le n-1$) selected in Procedure 2
are disjoint except for $s$. Additionally, the sum of the length of paths established
in Procedure 2 excluding those which are constructed by the recursive call of this
algorithm is of $O(n^2)$.

Proof. The paths selected in step 4 of Procedure 2 are disjoint except for $s$ by
the induction hypothesis. The path selected in step 2 and the paths selected in
step 5 are disjoint because of property 2 of classes. All the nodes on the paths
selected in step 4 have the figure $n$ in the last position of their labels. In addition,
all the nodes which are used to construct paths selected in steps 2 and 5 have last
figures other than $n$ in their labels, except for $s$ and $v_i$ ($1 \le i \le n-2$). Hence,
the paths selected in step 4 and the paths selected in steps 2 and 5 are disjoint
except for $s$ and $v_i$ ($1 \le i \le n-2$). Therefore, the $n-1$ paths constructed in
Procedure 2 are disjoint except for the source node $s$.

Moreover, the length of the path selected in step 2 is of $O(n)$ and the sum
of the length of paths selected in step 5 is of $O(n^2)$. Summing them up, the
sum of the length of paths established in Procedure 2 excluding those which are
constructed by the recursive call of the algorithm is of $O(n^2)$. □



3.3 Case II-2-A 

In this subsection, we will consider the case that there is at most one destination
node in each class, there is no destination node in the class to which the source
node $s$ belongs, and all the destination nodes belong to the subrotator graph
$P_{n-1}n$ to which the source node $s$ belongs. Procedure 3 below gives the $n-1$
paths from $s$ to the $n-1$ destination nodes in $D$ which are disjoint except for $s$.







Procedure 3

1. Let $D_1 = \{d_i \mid 1 \le i \le n-2\}$ and $D_2 = \{d_{n-1}\}$.

2. Obtain $n-2$ paths from $s$ to the $n-2$ destination nodes in $D_1$ which are
disjoint except for the node $s$ by calling the algorithm recursively. Here, if
the destination node $d_{n-1}$ is on one of the paths obtained, say, a path from
$s$ to a node $d_k$, then exchange the roles of the nodes $d_{n-1}$ and $d_k$.

3. Let $v$ represent the node in $P_{n-1}1$ which belongs to the class $C(d_{n-1})$.

4. Select the edge $s \to (2, 3, \ldots, n, 1)$.

5. In $P_{n-1}1$, establish the shortest path from $(2, 3, \ldots, n, 1)$ to $v$.

6. Construct a path from $v$ to the destination node $d_{n-1}$ within the class.

Fig. 9. Case II-2-A.



Lemma 6. The $n-1$ paths $s \to d_i$ ($1 \le i \le n-1$) selected in Procedure 3 are
disjoint except for $s$. Additionally, the sum of the length of paths established in
Procedure 3 excluding those which are constructed by the recursive call of this
algorithm is of $O(n)$.

Proof. The $n-2$ paths selected in step 2 of Procedure 3 are disjoint except for
$s$ by the induction hypothesis. The nodes on the paths selected in step 2 have
the figure $n$ in the last position of their labels. In addition, the nodes on the paths
selected in steps 4 to 6 have figures other than $n$ in the last position of their
labels, except for $s$ and $d_{n-1}$. Hence, the paths selected in step 2 and the paths
selected in steps 4 to 6 are disjoint except for $s$ and $d_{n-1}$. Moreover, the node
$d_{n-1}$ does not appear on the paths selected in step 2. Therefore, the $n-1$ paths
established in Procedure 3 are disjoint except for $s$.

Moreover, the length of the edge selected in step 4 is of $O(1)$. The length
of the shortest path in step 5 is of $O(n)$. In addition, the length of the path
constructed in step 6 is of $O(n)$. Summing them up, the sum of the length of
paths established in Procedure 3 excluding those which are constructed by the
recursive call of the algorithm is of $O(n)$. □

3.4 Case II-2-B-a 

In this subsection, we will consider the case which is characterized as follows:

- there is at most one destination node in each class,

- there is no destination node in the class to which the source node $s$ belongs,

- there are some destination nodes which do not belong to the subrotator
graph $P_{n-1}n$, and

- at most one destination node exists in each subrotator graph other than $P_{n-1}n$.

By Procedure 4 below, we can construct the $n-1$ paths from $s$ to the $n-1$
destination nodes in $D$ which are disjoint except for $s$.

Procedure 4

1. Let $D_1$ and $D_2$ represent the set of destination nodes in $P_{n-1}n$ and the other
destinations, respectively. Here, we can assume $D_2 = \{d_1, d_2, \ldots, d_k\}$
($1 \le k \le n-1$) without loss of generality.

2. Additionally, we can assume without loss of generality that the node $d_k$ has
the smallest figure in the last position in its label among the nodes in $D_2$.

3. Let each destination node $d_i$ belong to the subrotator graph $P_{n-1}l_i$
($1 \le i \le k-1$).

4. Select $k-1$ nodes $v_i$ ($1 \le i \le k-1$) in $P_{n-1}n$ which satisfy the following
conditions in a greedy manner.

   Condition 1: The first figure of the label of $v_i$ is $l_i$.

   Condition 2: $v_i \notin D_1$.

5. Obtain $n-2$ paths from $s$ to $D_1 \cup \{v_i \mid 1 \le i \le k-1\}$ which are disjoint
except for $s$ by calling the algorithm recursively.

6. For each $v_i$ ($1 \le i \le k-1$), select the edge $v_i \to R_n(v_i)$.

7. For each $R_n(v_i)$ ($1 \le i \le k-1$), select a shortest path from $R_n(v_i)$ to $d_i$.

8. Let $P_{n-1}l$ be the subrotator graph to which $d_k$ belongs.

9. Let $s'$ be the node which is in the class $C(s)$ and also belongs to $P_{n-1}l$.

10. Select a path from $s$ to $s'$ within the class $C(s)$.

11. Establish a shortest path from $s'$ to $d_k$.



Lemma 7. The $n-1$ paths $s \to d_i$ ($1 \le i \le n-1$) selected in Procedure 4 are
disjoint except for $s$. Additionally, the sum of the length of paths established in
Procedure 4 excluding those which are constructed by the recursive call of this
algorithm is of $O(n^2)$.

Fig. 10. Case II-2-B-a.

Proof. The paths selected in step 5 of Procedure 4 are disjoint except for $s$ by
the induction hypothesis. The nodes on the paths selected in steps 6 and 7 have
the figure $l_i$ in the last position of their labels. Hence, these paths are disjoint.
The nodes on the paths selected in step 5 have the figure $n$ in the last position
of their labels. Additionally, the nodes on the paths selected in steps 6 and 7 have
figures other than $n$ in the last position of their labels except for $v_i$
($1 \le i \le k-1$). Hence, the paths selected in step 5 and the paths selected in
steps 6 and 7 are disjoint except for $s$ and $v_i$ ($1 \le i \le k-1$). Similarly, the
paths selected in steps 10 and 11 and the paths selected in steps 5 to 7 can be
shown to be disjoint except for $s$. Therefore, the $n-1$ paths established in
Procedure 4 are disjoint except for $s$.

Additionally, the sum of the length of the $k-1$ edges selected in step 6 is
of $O(n)$, the sum of the length of the $k-1$ shortest paths selected in step 7 is
of $O(n^2)$, and the length of the path selected in step 10 is of $O(n)$.
Finally, the shortest path established in step 11 has length of $O(n)$. Summing
them up, the sum of the length of paths established in Procedure 4 excluding
those which are constructed by the recursive call of the algorithm is of $O(n^2)$. □

3.5 Case II-2-B-b 

Finally, in this subsection, we will consider the case which is characterized as
follows:

- there is at most one destination node in each class,

- there is no destination node in the class to which the source node $s$ belongs,

- there are some destination nodes which do not belong to the subrotator
graph $P_{n-1}n$, and

- there exist multiple destination nodes in a subrotator graph other than
$P_{n-1}n$.

Procedure 5 below constructs $n-1$ paths from $s$ to the $n-1$ destination nodes in
$D$ which are disjoint except for the source $s$.

Procedure 5

1. Select a path within the class $C(s)$ from $s$ to the node of a subrotator graph
which has multiple destination nodes. Let $P_{n-1}l$ and $s'$ represent the
subrotator graph and the node, respectively. Additionally, let $D_1 = \{d_1, d_2, \ldots, d_k\}$
($k < n-1$) be the set of destination nodes in $P_{n-1}l$.

2. For each class to which the $n-2$ destination nodes in $D - \{d_1\}$ belong, let each
$l_i$ ($1 \le i \le n-2$) represent the node which is in the class and belongs to
$P_{n-1}l$.

3. In $P_{n-1}l$, construct $n-2$ internally disjoint paths from $s'$ to $d_1$ [7]. If each path
includes one of the $l_i$'s, the subpath from $s'$ to $d_2$ is selected, and let $D_2 \leftarrow \{d_2\}$.
Otherwise, one of the paths from $s'$ to $d_1$ which do not include the nodes $l_i$
at all is selected, and let $D_2 \leftarrow \{d_1\}$.

4. For each of the $n-2$ classes to which each node in $D - D_2$ belongs, let each $v_i$
($1 \le i \le n-2$) represent the node which is in the class and also belongs to
$P_{n-1}n$.

5. Obtain the paths from $s$ to $v_i$ ($1 \le i \le n-2$) which are disjoint except for
$s$ by calling the algorithm recursively.

6. Select paths from each $v_i$ ($1 \le i \le n-2$) to the corresponding node in
$D - D_2$ within the class.

Fig. 11. Case II-2-B-b.



Lemma 8. The $n-1$ paths $s \to d_i$ ($1 \le i \le n-1$) constructed in Procedure 5
are disjoint except for $s$. Additionally, the sum of the length of paths established
in Procedure 5 excluding those which are constructed by the recursive call of this
algorithm is of $O(n^2)$.

Proof. The paths selected in step 5 of Procedure 5 are disjoint except for
$s$ by the induction hypothesis. By property 2 of classes, the paths
selected in step 6 are disjoint. The nodes on the paths selected in step
5 have the figure $n$ in the last position of their labels. Additionally, the nodes
on the paths selected in step 6 have figures other than $n$ in the last position
of their labels except for $v_i$ ($1 \le i \le n-2$). Hence, the paths selected in
step 5 and the paths selected in step 6 are disjoint except for $s$ and $v_i$
($1 \le i \le n-2$). Similarly, the paths selected in steps 1 and 3 and the
paths selected in steps 5 and 6 can be shown to be disjoint except for $s$.
Therefore, the $n-1$ paths established in Procedure 5 are disjoint except for $s$.

Additionally, the lengths of the paths selected in steps 1 and 3 are both of
$O(n)$. The sum of the length of the $n-2$ paths selected in step 6 is of $O(n^2)$.
Summing them up, the sum of the length of paths established in Procedure 5
excluding those which are constructed by the recursive call of the algorithm is
of $O(n^2)$. □

Theorem 9. The $n-1$ paths constructed by Procedures 1 to 5 are disjoint
except for $s$. The sum of the length of paths is of $O(n^3)$.

Proof. This follows from Lemmas 4 to 8. □
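
The length bound of Theorem 9 can be checked with a short calculation. As a sketch, write $L(n)$ for the sum of path lengths produced for an $n$-rotator graph and $c$ for a constant covering the per-level bounds of Lemmas 4 to 8; both symbols are introduced here and are not part of the original notation. Each procedure makes one recursive call on an $(n-1)$-subrotator graph, so:

```latex
\begin{align*}
L(n) &\le L(n-1) + c\,n^2 \\
     &\le L(2) + c \sum_{m=3}^{n} m^2 \\
     &\le L(2) + c\,\frac{n(n+1)(2n+1)}{6} = O(n^3).
\end{align*}
```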

Theorem 10. The complexity of the algorithm is 0{n^). 

Proof. We assume that a label of a node is represented by a linear array of 
n elements. Let T{n) represent the time complexity of our algorithm for an 
n-rotator graph. 

The first case branching operation in our algorithm is begun with calculation 
of nodes each of which belongs to the same class as each destination node and 
also belongs to the subrotator graph Pn~\n. These nodes can be used as repre- 
sentatives of classes to which destination nodes belong. This calculation takes 
0{n^) of time complexity. The other tests for other branching can be performed 
less than O(n^). Now, the destination nodes are classified into classes and each 
class has a subset of destination nodes. 

In Procedure 1, steps 2 and 4(b)viii are governing and they take O(n^) of 
time complexity. The algorithm for the node-to-node disjoint paths problem 
included in step 4(b)viii is of O(n^) of time complexity and the sum of the 
length of paths is of 0(n^)[7]. Hence, the time complexity of Procedure 1 is 
T(n) = T(n-l) + 0(n^). 

Procedure 2 is governed by step 5 which is of O(n^) of complexity. Therefore, 
the time complexity of Procedure 2 is T(n) = T(n — 1) -b O(n^). 

In Procedure 3, the check after the recursive call in step 2 is governing. 
The sum of length of paths is of O(n^), and it take 0(n) of time to compare 
two labels, step 3 requires O(n^) of time complexity. Hence, the complexity of 
Procedure 3 is T(n) = T{n — 1) -b O(n^). 

In Procedure 4, step 4 and 7 takes O(n^) of time complexity and it is gov- 
erning. So, the complexity of Procedure 4 is T(n) = T{n — 1) -b O(n^). 

Procedure 5 is governed by step 3 which is similar to step 4(b)viii of Proce- 
dure 1 and takes O(n^) of time complexity. Therefore, the time complexity of 
Procedure 5 is T{n) = T{n — 1) -b O(n^). 

From above discussion, the time complexity of our algorithm is represented 
by T{n) = T{n — 1) -b O(n^) which results in T(n) = 0(n®). □ 
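
The final step of this proof solves a recurrence of a standard shape. As a generic sketch, let $k$ stand for the exponent of the per-level cost and $c$ for the constant hidden in the $O$-notation; both symbols are assumptions introduced for this illustration. Unrolling the recurrence down to the trivial 2-rotator graph gives one extra factor of $n$:

```latex
\begin{align*}
T(n) &\le T(n-1) + c\,n^k \\
     &\le T(2) + c \sum_{m=3}^{n} m^k \\
     &\le T(2) + c\,n \cdot n^k = O(n^{k+1}).
\end{align*}
```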

4 Conclusions 

In this paper, we have presented an algorithm for the node-to-set disjoint paths
problem in $n$-rotator graphs which is of polynomial order of $n$. Future work
includes measuring the average sum of path lengths by computer simulation and
improving the algorithm.
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Abstract. We present new complexity results for simulation-checking and model- 
checking with infinite-state systems generated by pushdown automata and their 
proper subclasses of one-counter automata and one-counter nets (one-counter 
nets are ‘weak’ one-counter automata computationally equivalent to Petri nets 
with at most one unbounded place). 

As for simulation-checking, we show the following: a) simulation equivalence
between pushdown processes and finite-state processes is EXPTIME-complete;
b) simulation equivalence between processes of one-counter automata and
finite-state processes is coNP-hard; c) simulation equivalence between processes of
one-counter nets and finite-state processes is in P (to the best of our knowledge,
it is the first (and rather tight) polynomiality result for simulation with
infinite-state processes).

As for model-checking, we prove that a) the problems of simulation-checking
between processes of pushdown automata (or one-counter automata, or one-counter
nets) and finite-state processes are polynomially reducible to the model-checking
problem with a fixed formula $\varphi = \nu X.[z]\langle z\rangle X$ of the modal
$\mu$-calculus. Consequently, model-checking with $\varphi$ is EXPTIME-complete for
pushdown processes and coNP-hard for processes of one-counter automata;
b) model-checking with a fixed formula $\Diamond[a]\Diamond[b]\mathtt{ff}$ of the
logic EF (a simple fragment of CTL) is NP-hard for processes of OC-nets, and
model-checking with another fixed formula
$\Box\langle a\rangle\Box\langle b\rangle\mathtt{tt}$ of EF is coNP-hard.
Consequently, model-checking with any temporal logic which can express these simple
formulae is computationally hard even for the (very simple) sequential processes
of OC-nets.



1 Introduction 

Two important approaches to formal verification of concurrent systems are
equivalence-checking and model-checking. In both cases, a process is formally
understood to be (associated with) a state in a transition system, which is a triple
$T = (S, Act, \to)$ where $S$ is a set of states, $Act$ is a finite set of actions,
and $\to\ \subseteq S \times Act \times S$ is a transition relation. We write
$s \xrightarrow{a} t$ instead of $(s,a,t) \in\ \to$ and we extend this notation
to elements of $Act^*$ in the natural way. A state $t$ is reachable from a state $s$,
written $s \to^* t$, iff $s \xrightarrow{w} t$ for some $w \in Act^*$.

In the equivalence-checking approach, one describes the specification (the intended
behavior) and the actual implementation of a concurrent process as states in
transition systems, and then it is shown that they are equivalent. Here the notion of
equivalence can be formalized in various ways according to specific needs of a given
practical problem (see, e.g., [24] for an overview). A favorite approach is the one of
simulation equivalence, which has been found appropriate in many situations and
consequently its accompanying theory has been developed very intensively. Let
$T = (S, Act, \to)$ be a transition system. A binary relation $R \subseteq S \times S$
is a simulation iff whenever $(s,t) \in R$, then for each $s \xrightarrow{a} s'$ there
is some $t \xrightarrow{a} t'$ such that $(s',t') \in R$. A process $s$ is simulated
by $t$, written $s \sqsubseteq_s t$, iff there is a simulation $R$ such that
$(s,t) \in R$. Processes $s, t$ are simulation equivalent, written $s =_s t$, iff they
can simulate each other. Simulation can also be viewed as a game: imagine there are two
tokens put on states $s$ and $t$. Now two players, Al and Ex, start to play a
simulation game which consists of a (possibly infinite) number of rounds where each
round is performed as follows: Al takes the token which was put on $s$ originally and
moves it along a transition labelled by (some) $a$; the task of Ex is to move the other
token along a transition with the same label. Al wins the game iff after a finite
number of rounds Ex cannot respond to Al’s final attack. We see that
$s \sqsubseteq_s t$ iff Ex has a universal defending strategy, i.e., Al never wins
provided Ex plays in a sufficiently ‘clever’ way. We use the simulation game at some
points to give a more intuitive justification for our claims. Finally, let us note that
simulation can also be used to relate states of different transition systems; formally,
two systems are considered to be a single one by taking their disjoint union.

In the model-checking approach, desired properties of the implementation are
encoded as formulae of a certain temporal logic (interpreted over transition systems)
and then it is demonstrated that the implementation satisfies the formulae. There are
many systems of temporal logic differing in their expressive power, decidability,
complexity, and other aspects (see, e.g., [23,6]). In this paper we only work with one
(fixed) formula $\varphi = \nu X.[z]\langle z\rangle X$ of the modal $\mu$-calculus [13]
and some other (fixed) formulae of its very simple fragment which is known as the EF
logic (the logic EF can also be seen as a natural fragment of CTL [6]). A formal
definition of the syntax and semantics of the modal $\mu$-calculus is omitted due to
space constraints (we refer, e.g., to [13]). However, we do explain the meaning of
$\varphi$ in Section 3. Formulae of the logic EF look as follows:

$$\psi ::= \mathtt{tt} \mid \psi \wedge \psi \mid \neg\psi \mid \langle a\rangle\psi \mid \Diamond\psi$$

Here $a$ ranges over a given set of atomic actions. Dual operators to
$\langle a\rangle$ and $\Diamond$ are $[a]$ and $\Box$, defined by
$[a]\psi = \neg\langle a\rangle\neg\psi$ and $\Box\psi = \neg\Diamond\neg\psi$,
respectively. Let $T = (S, Act, \to)$ be a transition system. The denotation
$[\![\psi]\!]$ of a formula $\psi$ is the set of states where the formula holds;
it is defined as follows:

$$
\begin{aligned}
[\![\mathtt{tt}]\!] &= S \\
[\![\psi_1 \wedge \psi_2]\!] &= [\![\psi_1]\!] \cap [\![\psi_2]\!] \\
[\![\neg\psi]\!] &= S - [\![\psi]\!] \\
[\![\langle a\rangle\psi]\!] &= \{ s \in S \mid \exists t \in S : s \xrightarrow{a} t \wedge t \in [\![\psi]\!] \} \\
[\![\Diamond\psi]\!] &= \{ s \in S \mid \exists t \in S : s \to^* t \wedge t \in [\![\psi]\!] \}
\end{aligned}
$$

The ‘language’ of transition systems is not very practical: concurrent systems often
have a very large (or even infinite) state-space and hence it is not feasible to define
their semantics ‘directly’ by means of transition systems. Therefore, ‘higher’ languages
which allow compact definitions of large systems have been proposed and studied.
In this paper we mainly work with (subclasses of) pushdown automata, which
are considered as a fundamental model of sequential behaviors in the framework of
concurrency theory (for example, one can conveniently model programs consisting of
mutually recursive procedures in the syntax of PDA, and existing verification
techniques for PDA are then applicable to, e.g., some problems of data-flow analysis [7]).
Formally, a pushdown automaton is a tuple $\mathcal{A} = (Q, \Gamma, Act, \delta)$
where $Q$ is a finite set of control states, $\Gamma$ is a finite stack alphabet,
$Act$ is a finite input alphabet, and
$\delta : (Q \times \Gamma) \to \mathcal{P}(Act \times (Q \times \Gamma^*))$ is a
transition function with finite image. We can assume (w.l.o.g.) that each transition
increases the height (or length) of the stack at most by one (each PDA can be
efficiently transformed to this kind of normal form). In the rest of this paper we adopt
a more intuitive notation, writing $pA \xrightarrow{a} q\beta \in \delta$ instead of
$(a,(q,\beta)) \in \delta(p,A)$. To $\mathcal{A}$ we associate the transition system
$T_{\mathcal{A}}$ where $Q \times \Gamma^*$ is the set of states (we write $p\alpha$
instead of $(p,\alpha)$), $Act$ is the set of actions, and the transition relation is
determined by $pA\alpha \xrightarrow{a} q\beta\alpha \iff pA \xrightarrow{a} q\beta \in \delta$.
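
The way a pushdown automaton induces its (infinite) transition system can be sketched in a few lines of Python. This is only an illustration under assumed conventions that are not part of the paper: rules are stored in a dictionary keyed by (control state, top symbol), stacks are strings with the top symbol first, and the rule set `rules` is a made-up example.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: rules[(p, A)] lists triples (action, q, beta),
# each standing for a rule  pA --action--> q beta.
Rules = Dict[Tuple[str, str], List[Tuple[str, str, str]]]


def successors(rules: Rules, state: str, stack: str) -> List[Tuple[str, str, str]]:
    """One-step successors of the process (state, stack) as (action, state', stack')."""
    if not stack:  # an empty stack enables no rule
        return []
    top, rest = stack[0], stack[1:]
    # A rule rewrites only the top symbol; the rest of the stack is carried along.
    return [(a, q, beta + rest) for (a, q, beta) in rules.get((state, top), [])]


# Made-up example: p pushes A's on action a and hands over to q on action b.
rules: Rules = {
    ("p", "A"): [("a", "p", "AA"), ("b", "q", "")],
    ("q", "A"): [("b", "q", "")],
}
```

Only the successors of one process are computed on demand, which reflects the fact that the full transition system has the infinite state set $Q \times \Gamma^*$ and cannot be built explicitly.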

A natural and important subclass of pushdown automata is the class of one-counter
automata where the stack behaves like a counter. Such a restriction is reasonable
because in practice we often meet systems which can be abstracted to finite-state
programs operating on a single unbounded variable. For example, network protocols can
maintain a count of how many unacknowledged messages have been sent, a printer
spool should know how many processes are waiting in the input queue, etc. Formally,
a one-counter automaton $\mathcal{A}$ is a pushdown automaton with just two stack
symbols $I$ and $Z$; the transition function $\delta$ of $\mathcal{A}$ is a union of
functions $\delta_Z$ and $\delta_I$ where
$\delta_Z : (Q \times \{Z\}) \to \mathcal{P}(Act \times (Q \times (\{I\}^*\{Z\})))$ and
$\delta_I : (Q \times \{I\}) \to \mathcal{P}(Act \times (Q \times \{I\}^*))$. Hence,
$Z$ works like a bottom symbol (which cannot be removed), and the number of pushed
$I$'s represents the counter value. Processes of $\mathcal{A}$ (i.e., states of
$T_{\mathcal{A}}$) are of the form $pI^iZ$, which is abbreviated to $p(i)$ in the rest
of this paper. Again, we assume (w.l.o.g.) that each transition increases the counter at
most by one. A proper subclass of one-counter automata which is of interest in its own
right are one-counter nets. Intuitively, OC-nets are ‘weak’ OC-automata which cannot
test for zero explicitly. They are computationally equivalent to a subclass of Petri
nets [22] with (at most) one unbounded place. Hence, one-counter nets can be used, e.g.,
to model systems consisting of producers and consumers which share an infinite buffer
(a non-empty buffer enables the execution of consumers but it need not be tested for
zero explicitly). Formally, a one-counter net $\mathcal{N}$ is a one-counter automaton
such that whenever $pZ \xrightarrow{a} qI^iZ \in \delta$, then
$pI \xrightarrow{a} qI^{i+1} \in \delta$. In other words, each transition which is
enabled at zero-level is also enabled at (each) non-zero-level. Hence, there are no
‘zero-specific’ transitions which could be used to ‘test for zero’.

The state of the art: Let PDA, BPA, OC-A, OC-N, and FS be the classes of all
processes of pushdown automata, stateless pushdown automata, one-counter automata,
one-counter nets, and finite-state systems, respectively. Moreover, let PN, BPP, and PA
denote the classes of all processes of Petri nets [22], basic parallel processes [5], and
process algebra [4], respectively. The problems of simulation preorder and simulation
equivalence between processes of classes A and B are denoted by A $\sqsubseteq_s$ B and
A $=_s$ B, respectively. The problem of simulation-checking with (certain classes of)
infinite-state systems has been attracting attention for almost a decade; here we only
mention some of the most relevant results. First, it was shown in [8] that the problems
BPA $\sqsubseteq_s$ BPA and BPA $=_s$ BPA are undecidable. The undecidability of
BPP $\sqsubseteq_s$ BPP and BPP $=_s$ BPP was proved in [9]. An interesting positive
result is [1] where it is shown that OC-N $\sqsubseteq_s$ OC-N (and hence also
OC-N $=_s$ OC-N) is decidable. However, OC-A $\sqsubseteq_s$ OC-A and OC-A $=_s$ OC-A
are already undecidable [12]. The problem of checking simulation between infinite and
finite-state systems was first examined in [11] where it is shown that
PN $\sqsubseteq_s$ FS, FS $\sqsubseteq_s$ PN, and PN $=_s$ FS are decidable. A similar
positive result was later demonstrated in [16] for the PDA $\sqsubseteq_s$ FS,
FS $\sqsubseteq_s$ PDA, and PDA $=_s$ FS problems; some complexity estimates were also
given (see below). Moreover, the problems PA $\sqsubseteq_s$ FS, FS $\sqsubseteq_s$ PA,
and PA $=_s$ FS are proved to be undecidable.

Decidability and complexity results for checking other behavioral equivalences (in
particular, strong and weak bisimilarity [21,20]) between infinite and finite-state
systems also exist; we give a short comparison in the final section.

Our contribution: In our paper we present new complexity results for simulation- 
checking and model-checking problems with the above mentioned subclasses of push- 
down processes. The most significant original contributions are summarized below to- 
gether with a short discussion on previous work. 

- PDA $=_s$ FS is EXPTIME-complete. Previously, there was a coNP lower bound
for the problem [16] (this lower bound also works for BPA processes). In the same
paper, the membership of PDA $=_s$ FS in EXPTIME has also been shown, hence
here we only need to prove the EXPTIME lower bound.

- OC-A $=_s$ FS is coNP-hard. The problem whether this lower bound is tight is left
open. Intuitively, the problem should be expected to be easier than for PDA processes,
because there is a substantial simplification in the case of strong bisimilarity: the
problem of strong bisimilarity with finite-state processes is in P for OC-A processes
[14], but PSPACE-complete for PDA processes [19].

- OC-N $=_s$ FS is in P. In fact, we show that OC-N $\sqsubseteq_s$ FS and
FS $\sqsubseteq_s$ OC-N are in P. To the best of our knowledge, this is the first (and
rather tight) polynomiality result for simulation with infinite-state systems. Let us
note that some equivalence-checking problems between processes of OC-nets and FS
processes are still hard (for example, weak bisimilarity is DP-hard [14]), so the result
is not immediate (see also the comments below).

- Next, we show that the problems of simulation preorder/equivalence between
processes of PDA (or OC-A, or OC-N) and FS processes are reducible to the
model-checking problem for the fixed formula $\varphi = \nu X.[z]\langle z\rangle X$ of
(the alternation-free fragment of) the modal $\mu$-calculus. It is essentially a simple
observation which was (in a similar form) used already in [2,16,18]. The point is that
(due to the previous hardness results) we can conclude that the problem of
model-checking with $\varphi$ is EXPTIME-complete for PDA processes (the upper bound is
due to [25]) and coNP-hard for OC-A processes. An interesting thing is that the
model-checking problem for stateless pushdown (i.e., BPA) processes and any fixed
formula of the modal $\mu$-calculus is already polynomial [25]. The classes of BPA and
OC-A processes are rather natural but incomparable subclasses of PDA processes; we see
that the absence of a finite control is a ‘stronger’ simplification than the replacement
of the storage device (counter instead of stack) in this case.

As simulation between OC-N and FS processes is in P, the aforementioned
technique does not yield any hardness result for model-checking with OC-N processes.
Therefore, we examine the problem directly: we prove that even model-checking
with a simple fixed formula $\Diamond[a]\Diamond[b]\mathtt{ff}$ of the logic EF is
NP-hard for OC-N processes, and model-checking with another fixed formula
$\Box\langle a\rangle\Box\langle b\rangle\mathtt{tt}$ is coNP-hard. Hence, we can
forget about an efficient model-checking procedure for OC-N processes and any modal
logic which can express these simple formulae (unless P = NP).
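
To make the two fixed EF formulae concrete, the following Python sketch evaluates EF formulae over a finite transition system by following the denotation clauses of EF clause by clause. It is an illustration only: the tuple encoding of formulae, the helper names `box`, `always`, `denote`, and the three-state example system are assumptions made here, and the sketch applies to finite systems, not to the infinite-state OC-N processes for which the hardness results are stated.

```python
TT = ("tt",)
FF = ("not", TT)


def box(a, f):
    """[a]f, defined as the dual of <a>."""
    return ("not", ("dia", a, ("not", f)))


def always(f):
    """Box f, defined as the dual of the EF diamond."""
    return ("not", ("ef", ("not", f)))


def denote(f, states, trans):
    """Set of states satisfying formula f; trans is a set of (s, action, t) triples."""
    kind = f[0]
    if kind == "tt":
        return set(states)
    if kind == "and":
        return denote(f[1], states, trans) & denote(f[2], states, trans)
    if kind == "not":
        return set(states) - denote(f[1], states, trans)
    if kind == "dia":  # <a>f: some a-successor satisfies f
        target = denote(f[2], states, trans)
        return {s for (s, a, t) in trans if a == f[1] and t in target}
    if kind == "ef":   # some reachable state (possibly the state itself) satisfies f
        result = denote(f[1], states, trans)
        changed = True
        while changed:
            changed = False
            for (s, _, t) in trans:
                if t in result and s not in result:
                    result.add(s)
                    changed = True
        return result
    raise ValueError(f"unknown formula kind: {kind}")


# Made-up finite example system.
states = {0, 1, 2}
trans = {(0, "a", 1), (1, "b", 1), (0, "a", 2)}

np_hard_formula = ("ef", box("a", ("ef", box("b", FF))))
conp_hard_formula = always(("dia", "a", always(("dia", "b", TT))))
```

On a finite system each clause is a simple set computation; the difficulty addressed in the paper comes from the unbounded counter, which this sketch does not model.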

2 Results about Equivalence-Checking 

Theorem 1. The problem of simulation equivalence between PDA processes and de- 
terministic FS processes is EXPTIME-hard. 

Proof. We show EXPTIME-hardness by reduction from the acceptance problem for
alternating LBA (which is known to be EXPTIME-complete). An alternating LBA is a
tuple $\mathcal{M} = (Q, \Sigma, \delta, q_0, \vdash, \dashv, p)$ where $Q$, $\Sigma$,
$\delta$, $q_0$, $\vdash$, and $\dashv$ are defined as for ordinary non-deterministic
LBA (in particular, $\vdash$ and $\dashv$ are the left-end and right-end markers,
resp.), and $p : Q \to \{\forall, \exists, acc, rej\}$ is a function which partitions
the states of $Q$ into universal, existential, accepting, and rejecting, respectively.
We assume (w.l.o.g.) that $\delta$ is defined so that ‘terminated’ configurations (i.e.,
the ones from which there are no further computational steps) are exactly accepting and
rejecting configurations. A computational tree for $\mathcal{M}$ on a word
$w \in \Sigma^*$ is any (finite or infinite) tree $T$ satisfying the following: the root
of $T$ is (labeled by) the initial configuration $q_0{\vdash}w{\dashv}$ of
$\mathcal{M}$, and if $N$ is a node of $T$ labeled by a configuration $uqv$ where
$u, v \in \Sigma^*$ and $q \in Q$, then the following holds:

- if $q$ is accepting or rejecting, then $N$ is a leaf;

- if $q$ is existential, then $N$ has one successor whose label is (some) configuration
which can be reached from $uqv$ in one computational step (according to $\delta$);

- if $q$ is universal, then $N$ has $m$ successors where $m$ is the number of all
configurations which can be reached from $uqv$ in one step; those configurations are
used as labels of the successors in one-to-one fashion.

$\mathcal{M}$ accepts $w$ iff there is a finite computational tree $T$ such that all
leaves of $T$ are accepting configurations.

Now we describe a polynomial algorithm which for a given alternating LBA
$\mathcal{M} = (Q, \Sigma, \delta, q_0, \vdash, \dashv, p)$ and a word $w \in \Sigma^*$
constructs a process $P$ of a PDA system $\mathcal{A}$ and a process $F$ of a
finite-state system $\mathcal{F}$ such that

- $P \sqsubseteq_s F$, and

- $F \sqsubseteq_s P$ iff $\mathcal{M}$ does not accept $w$.

Hence, $\mathcal{M}$ accepts $w$ iff $P \neq_s F$ and we are (virtually) done.

Intuition: The underlying system $\mathcal{F}$ of $F$ looks as follows (note that
$\mathcal{F}$ is deterministic):

[Figure: the finite-state system $\mathcal{F}$]

Intuitively, the goal of $F$ is to demonstrate that there is an accepting computational
tree for $\mathcal{M}$ on $w$, while $P$ aims to show the converse. The game starts with
the initial configuration $q_0{\vdash}w{\dashv}$ stored in the stack of $P$. Now $F$
‘chooses’ the next configuration (i.e., the rule of $\delta$ which is to be applied to
the current configuration stored at the top of the stack) by emitting one of the
$next_i$ actions. The quotes are important here because $P$ is constructed in such a way
that it has to accept the choice of $F$ only if the control state of the current
configuration is existential. If it is universal, $P$ can ‘ignore’ the dictate of $F$
and choose the next configuration according to its own will. The new configuration is
then pushed to the stack of $P$ (technically, it is done by guessing individual symbols,
and an auxiliary verification mechanism is added so that $P$ cannot gain anything if it
starts to cheat). As soon as $P$ enters an accepting configuration, it ‘dies’ (i.e., it
is not able to emit any action); and as soon as it enters a rejecting configuration, it
starts to behave identically to $F$. Hence, if there is an accepting computational tree
for $\mathcal{M}$ on $w$, then $F$ can force $P$ to enter an accepting configuration in
finitely many rounds (and hence $F \not\sqsubseteq_s P$). If there is no accepting
computational tree, then $P$ can successfully defend; it either enters a rejecting
configuration or the game goes on forever. In both cases this means that
$F \sqsubseteq_s P$. Moreover, a careful design of $P$ ensures that $P \sqsubseteq_s F$
regardless of whether $\mathcal{M}$ accepts $w$ or not. A full (formal) proof is omitted
due to space constraints; it can be found in [15]. □

In the proof of our next theorem we use the technique for encoding assignments of 
Boolean variables in the structure of one-counter automata discovered in [14]. 

Theorem 2. The problem of simulation equivalence between OC-A processes and FS 
processes is coNP-hard. 



Proof. We show coNP-hardness by reduction of the coNP-complete problem Unsat.
An instance is a Boolean formula $\psi$ in CNF. The question is whether $\psi$ is
unsatisfiable.

Let $\psi = C_1 \wedge \cdots \wedge C_m$ be a formula in CNF where $C_i$ are clauses
over propositional variables $x_1, \ldots, x_n$. We construct (in polynomial time) a
process $P$ of an OC-A system $\mathcal{A}$ and a process $F$ of a finite-state system
$\mathcal{F}$ such that $P \sqsubseteq_s F$ iff $\psi$ is unsatisfiable. Then we simply
consider the processes $P'$ and $F'$ which have the following outgoing transitions:
$P' \xrightarrow{x} P$, $P' \xrightarrow{x} F$, and $F' \xrightarrow{x} F$, where $x$
is a fresh action. Observe that $P'$ is easily definable in the syntax of one-counter
processes and $F'$ in the syntax of finite-state processes. Clearly
$F' \sqsubseteq_s P'$, and $P' \sqsubseteq_s F'$ iff $P \sqsubseteq_s F$. In other
words, $P' =_s F'$ iff $\psi$ is unsatisfiable, which proves our theorem.

It remains to show the construction of $P$, $\mathcal{A}$, $F$, and $\mathcal{F}$. The
set of actions of $\mathcal{A}$ and $\mathcal{F}$ is $Act = \{a, b, c_1, \ldots, c_m\}$.
Let $A_i = Act - \{a, c_i\}$. The set of states of $\mathcal{F}$ is
$\{F, F_1, \ldots, F_m\}$ and its transitions are $F \xrightarrow{a} F$,
$F \xrightarrow{b} F_i$ for each $1 \le i \le m$, and $F_i \xrightarrow{y} F_i$ for each
$y \in A_i$ and each $1 \le i \le m$. Hence, the system $\mathcal{F}$ looks as follows:

[Figure: the finite-state system $\mathcal{F}$]

In the construction of $\mathcal{A}$ we rely on the following theorem of number theory
(see, e.g., [3]): Let $p_i$ be the $i$-th prime number, and let
$f : \mathbb{N} \to \mathbb{N}$ be a function which assigns to each $n$ the sum
$\sum_{i=1}^{n} p_i$. Then $f$ is $O(n^3)$. This fact ensures that $\mathcal{A}$ has
only polynomially many control states (see below).

The set of control states $Q$ of $\mathcal{A}$ is
$\{s, r\} \cup \{s_{(p_i,j)} \mid 1 \le i \le n,\ 0 \le j < p_i\}$. For
each $1 \le i \le n$ we now define two sets of actions.

- $B_i = \{c_j \mid 1 \le j \le m$, the variable $x_i$ appears positively in the clause $C_j\}$

- $\overline{B}_i = \{c_j \mid 1 \le j \le m$, the variable $x_i$ appears negatively in the clause $C_j\}$

Transitions of $\mathcal{A}$ are defined as follows:

- $sZ \xrightarrow{a} sIZ$, $sI \xrightarrow{a} sII$, $sI \xrightarrow{b} rI$,

- $rI \xrightarrow{b} s_{(p_i,0)}I$ for each $1 \le i \le n$,

- $s_{(p_i,j)}I \xrightarrow{b} s_{(p_i,(j+1) \bmod p_i)}\varepsilon$ for each $1 \le i \le n$ and each $0 \le j < p_i$,

- $s_{(p_i,0)}Z \xrightarrow{y} s_{(p_i,0)}Z$ for each $1 \le i \le n$ and each $y \in B_i$,

- $s_{(p_i,j)}Z \xrightarrow{y} s_{(p_i,j)}Z$ for each $1 \le i \le n$, each $1 \le j < p_i$, and each $y \in \overline{B}_i$.

The structure of the transition system associated to $\mathcal{A}$ is depicted in the
following figure (transition systems associated to OC systems can be viewed as
two-dimensional ‘tables’ with an infinite height where control states are used as column
indexes and counter values as row indexes; as the outgoing transitions of a process
$p(i)$ for $i > 0$ do not depend on the exact value of $i$, it suffices to depict the
outgoing transitions at the zero level and (some) non-zero level):

[Figure: the transition system associated to $\mathcal{A}$]




The initial state is $s(0)$. Intuitively, $P$ first increases its counter, emitting a
sequence of $a$’s. Then it emits the first $b$ action and changes its control state to
$r$ (preserving the value stored in the counter). To each state $r(l)$ we associate the
(unique) assignment $\nu_l$ defined by $\nu_l(x_i) = \mathtt{tt}$ iff
$r(l) \to^* s_{(p_i,0)}(0)$ (i.e., $\nu_l(x_i) = \mathtt{ff}$ iff
$r(l) \to^* s_{(p_i,j)}(0)$ for some $1 \le j < p_i$). Conversely, for each assignment
$\nu$ there is $l \in \mathbb{N}$ such that $\nu = \nu_l$ (for example, we can put
$l = \prod_{j=1}^{n} f(j)$, where $f(j) = p_j$ if $\nu(x_j) = \mathtt{tt}$, and
$f(j) = 1$ otherwise). Now it is easy to check that a clause $C_k$ is true for an
assignment $\nu_l$ iff at least one of the ‘bottom’ states $s_{(p_i,j)}(0)$ where a
$c_k$-loop is enabled (see above) is reachable from $r(l)$.

Let $s(l)$ be a state of $P$ such that $\nu_l(\psi) = \mathtt{ff}$. It means that there
is $1 \le k \le m$ such that $\nu_l(C_k) = \mathtt{ff}$. Hence, the process $F$ can
safely match the transition $s(l) \xrightarrow{b} r(l)$ by $F \xrightarrow{b} F_k$ (from
that point on it can do everything except $c_k$). However, if there is some $l$ such
that $\nu_l(\psi) = \mathtt{tt}$, then $F$ does not have any ‘safe’ matching move for
the transition $s(l) \xrightarrow{b} r(l)$ because none of its $F_k$ successors can do
all of the $c_i$ actions. Hence, $\psi$ is unsatisfiable iff $s(0) \sqsubseteq_s F$. □
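
The arithmetic behind the encoding of assignments by counter values can be checked with a short Python sketch: the counter value $l$ encodes the assignment in which $x_i$ is true iff the $i$-th prime divides $l$. The function names and the signed-integer clause format (`+i` for $x_i$, `-i` for its negation) are conventions assumed here for illustration; the sketch mirrors only the number-theoretic part of the proof, not the automaton itself.

```python
def first_primes(n):
    """The first n prime numbers, by trial division (fine for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def assignment_of(l, n):
    """Assignment encoded by counter value l: x_i is true iff p_i divides l."""
    return [l % p == 0 for p in first_primes(n)]


def counter_of(assignment):
    """A counter value encoding the given assignment: product of p_j over true x_j."""
    l = 1
    for value, p in zip(assignment, first_primes(len(assignment))):
        if value:
            l *= p
    return l


def satisfies(l, clauses, n):
    """Does the assignment encoded by l satisfy every clause of the CNF formula?"""
    v = assignment_of(l, n)
    return all(any(v[abs(lit) - 1] == (lit > 0) for lit in clause) for clause in clauses)
```

For instance, with three variables the assignment (tt, ff, tt) is encoded by $2 \cdot 5 = 10$, and decoding 10 gives the same assignment back. Every multiple of 10 that is not divisible by 3 encodes it as well, which is why the encoding value is not unique.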

Now we prove that simulation preorder and simulation equivalence between
processes of one-counter nets and finite-state processes can be decided in polynomial
time. To the best of our knowledge, these are the first polynomiality results for
simulation with infinite-state systems. Intuitively, the crucial property which makes
our proofs possible (and which does not hold for general one-counter automata) is the
following kind of ‘monotonicity’: if $p(i)$ is a process of a one-counter net, then
$p(i) \sqsubseteq_s p(j)$ for every $j \ge i$.

It should be noted that in our next constructions we prefer simplicity to optimality.
Therefore, it does not pay to evaluate the degrees of polynomials explicitly (though
it would be of course possible) because they would considerably decrease after some
straightforward optimizations. Our only aim here is to prove membership in P.

Let $T = (S, Act, \to)$ be a transition system. A family of relations
$\sqsubseteq_s^i$, $i \in \mathbb{N}_0$, is defined inductively as follows:

- $s \sqsubseteq_s^0 t$ for all $s, t \in S$;

- $s \sqsubseteq_s^{i+1} t$ iff $s \sqsubseteq_s^i t$ and for each
$s \xrightarrow{a} s'$ there is some $t \xrightarrow{a} t'$ such that
$s' \sqsubseteq_s^i t'$.

Intuitively, $s \sqsubseteq_s^i t$ iff Ex has a defending strategy for the first $i$
rounds of the simulation game. If we restrict ourselves to processes of
finitely-branching transition systems (where each state has only finitely many
$a$-successors for every action $a$), then $s \sqsubseteq_s t$ iff
$s \sqsubseteq_s^i t$ for every $i \in \mathbb{N}_0$ (observe that transition systems
generated by PDA are finitely-branching). This enables the following (straightforward)
polynomial-time algorithm for checking simulation between finite-state processes:

Lemma 1. Let $\mathcal{F} = (F, Act, \to)$ and $\mathcal{G} = (G, Act, \to)$ be
finite-state systems with $m$ and $n$ states, respectively. Let $k = m \cdot n$. For all
$f \in F$ and $g \in G$ we have that $f \sqsubseteq_s^k g$ iff
$f \sqsubseteq_s^{k+1} g$ iff $f \sqsubseteq_s g$. Moreover, the relation
$\sqsubseteq_s^k$ can be computed in time which is polynomial in the size of
$\mathcal{F}$ and $\mathcal{G}$.

Proof. If we start to construct the family of $\sqsubseteq_s^i$ relations according to
the above stated definition, we must reach the greatest fixed-point after (at most) $k$
refinement rounds, because $\sqsubseteq_s^0$ contains only $k$ elements and
$\sqsubseteq_s^{i+1} \subseteq\ \sqsubseteq_s^i$ for each $i \in \mathbb{N}_0$. It is
clear that each refinement step can be computed in time which is polynomial in the size
of $\mathcal{F}$ and $\mathcal{G}$. □
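
The refinement procedure from the proof of Lemma 1 can be sketched in Python as follows. The representation (a single set of `(source, action, target)` triples covering both systems, and the function name `simulation_preorder`) is an assumption made for this illustration; the sketch favors clarity over efficiency, in the same spirit as the proof.

```python
def simulation_preorder(states_f, states_g, trans):
    """Pairs (f, g) such that f is simulated by g.

    trans is a set of (source, action, target) triples for both systems together.
    """
    succ = {}
    for (s, a, t) in trans:
        succ.setdefault((s, a), set()).add(t)
    actions = {a for (_, a, _) in trans}

    # Round 0: every pair is related.
    rel = {(f, g) for f in states_f for g in states_g}
    while True:
        # One refinement round: keep (f, g) only if every move of f can be
        # matched by an equally labelled move of g into a currently related pair.
        refined = {
            (f, g) for (f, g) in rel
            if all(
                any((f2, g2) in rel for g2 in succ.get((g, a), ()))
                for a in actions
                for f2 in succ.get((f, a), ())
            )
        }
        if refined == rel:  # greatest fixed point reached
            return rel
        rel = refined


# Made-up example: f0 -a-> f1 -b-> f2 versus a system that branches on a.
example_trans = {
    ("f0", "a", "f1"), ("f1", "b", "f2"),
    ("g0", "a", "g1"), ("g0", "a", "g2"), ("g1", "b", "g3"),
}
example_rel = simulation_preorder(
    {"f0", "f1", "f2"}, {"g0", "g1", "g2", "g3"}, example_trans
)
```

Each round removes at least one pair or stops, so the loop runs at most $k + 1$ times for $k = m \cdot n$ initial pairs, which matches the bound used in the lemma.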

Lemma 2. The problem whether a OC-N process can be simulated by a finite-state 
process is in P. 

Proof. Let $\mathcal{N} = (Q, \{I, Z\}, Act, \delta)$ be a one-counter net and
$\mathcal{F} = (F, Act, \to)$ a finite-state system. We show that (a description of) the
simulation preorder between processes of $\mathcal{N}$ and $\mathcal{F}$ can be computed
in time which is polynomial in the size of $\mathcal{N}$ and $\mathcal{F}$.

The first step of our algorithm is a construction of a characteristic finite-state
system of $\mathcal{N}$, denoted $\mathcal{F}_{\mathcal{N}}$, which is defined as
follows: $\mathcal{F}_{\mathcal{N}} = (\overline{Q}, Act, \to)$ where
$\overline{Q} = \{\overline{p} \mid p \in Q\}$ and
$\overline{p} \xrightarrow{a} \overline{q}$ iff $pI \xrightarrow{a} qI^i \in \delta$ for
some $i \in \mathbb{N}_0$. Hence, a process $\overline{p}$ of
$\mathcal{F}_{\mathcal{N}}$ intuitively corresponds to a ‘limit process’ $p(\infty)$ of
$\mathcal{N}$ (in particular, observe that $p(i) \sqsubseteq_s \overline{p}$ for all
$p \in Q$ and $i \in \mathbb{N}_0$). It is obvious that the system
$\mathcal{F}_{\mathcal{N}}$ can be constructed in linear time.

Next, for all $p \in Q$ and $f \in F$ we check whether $\overline{p} \sqsubseteq_s f$.
It can be done in polynomial time (see Lemma 1). Now observe that if
$\overline{p} \sqsubseteq_s f$ for given $p$ and $f$, we can conclude that
$p(i) \sqsubseteq_s f$ for any $i \in \mathbb{N}_0$, because
$p(i) \sqsubseteq_s \overline{p}$. If $\overline{p} \not\sqsubseteq_s f$, then
$\overline{p} \not\sqsubseteq_s^k f$ where $k = |Q| \cdot |F|$ (see Lemma 1). Hence,
$\overline{p}$ can win the simulation game over $f$ in (at most) $k$ steps. It is clear
that the process $p(k)$ can ‘mimic’ this winning strategy of $\overline{p}$, because the
counter can be decreased at most by $k$ within the first $k$ moves (note that if we
allowed the counter to be tested for zero, then $p(k)$ could not mimic the first $k$
moves of $\overline{p}$ in general). The same applies to any process $p(i)$ where
$i \ge k$, because then $p(k) \sqsubseteq_s p(i)$. To sum up, at this point we know
whether $p(i) \sqsubseteq_s f$ for all $p \in Q$, $f \in F$, and $i \ge k$. It remains
to decide simulation between pairs of the form $(p(i), f)$ where $0 \le i < k$. As there
are only $|Q| \cdot |F| \cdot k = |Q|^2 \cdot |F|^2$ such pairs, we can use a simple
refinement technique similar to the one of Lemma 1. Formally, we define a family of
relations $\mathcal{R}^j$ inductively as follows:

- $\mathcal{R}^0 = \{(p(i), f) \mid i < k,\ p \in Q,\ f \in F\}$

- $\mathcal{R}^{j+1}$ consists of those pairs of the form $(p(i), f)$ for which we
either have that $\overline{p} \sqsubseteq_s f$, or $(p(i), f) \in \mathcal{R}^j$ and
for each move $p(i) \xrightarrow{a} q(l)$ there is a move $f \xrightarrow{a} g$ such
that $\overline{q} \sqsubseteq_s g$ or $(q(l), g) \in \mathcal{R}^j$.

Let $\mathcal{R}$ be the greatest fixed point of this refinement procedure. First,
observe that $\mathcal{R}$ is computable in polynomial time because it is reached in (at
most) $|Q|^2 \cdot |F|^2$ refinement steps and each step can obviously be computed in
polynomial time. Now let us consider a pair of the form $(p(i), f)$ where $p \in Q$,
$i < k$, and $f \in F$. If $(p(i), f) \notin \mathcal{R}$, then obviously
$p(i) \not\sqsubseteq_s f$. On the other hand, if $(p(i), f) \in \mathcal{R}$, then
$p(i) \sqsubseteq_s f$ because we can readily confirm that the relation
$\mathcal{R} \cup \{(q(l), g) \mid q \in Q,\ l \in \mathbb{N}_0,\ g \in F,\ \overline{q} \sqsubseteq_s g\}$
is a simulation.

□
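
The algorithm described in this proof can be sketched in Python as below. The encoding is an assumption made for illustration: `pos_rules[p]` lists rules `(a, q, d)` usable at a positive counter value (with counter change `d` in {-1, 0, 1}), `zero_rules[p]` lists rules usable at zero (with `d` in {0, 1}), and the caller is trusted to respect the one-counter-net condition that every zero-level rule also appears at the positive level. Function and variable names are hypothetical.

```python
def largest_simulation(left, right, succ, actions):
    """Greatest simulation between two finite state sets (refinement as in Lemma 1)."""
    rel = {(x, y) for x in left for y in right}
    while True:
        refined = {
            (x, y) for (x, y) in rel
            if all(any((x2, y2) in rel for y2 in succ.get((y, a), ()))
                   for a in actions for x2 in succ.get((x, a), ()))
        }
        if refined == rel:
            return rel
        rel = refined


def ocn_simulated_by_fs(Q, pos_rules, zero_rules, F, fs_trans):
    """Return sim(p, i, f), which decides whether p(i) is simulated by f."""
    actions = {a for (_, a, _) in fs_trans}
    for rules in list(pos_rules.values()) + list(zero_rules.values()):
        actions |= {a for (a, _, _) in rules}

    succ = {}
    for (f, a, g) in fs_trans:
        succ.setdefault((("fs", f), a), set()).add(("fs", g))
    # Characteristic system: bar-p --a--> bar-q for every positive-level rule.
    for p in Q:
        for (a, q, _) in pos_rules.get(p, []):
            succ.setdefault((("bar", p), a), set()).add(("bar", q))
    bar = largest_simulation({("bar", p) for p in Q},
                             {("fs", f) for f in F}, succ, actions)

    def limit_ok(p, f):  # is the 'limit process' of p simulated by f?
        return (("bar", p), ("fs", f)) in bar

    k = len(Q) * len(F)

    def moves(p, i):
        rules = pos_rules.get(p, []) if i > 0 else zero_rules.get(p, [])
        return [(a, q, i + d) for (a, q, d) in rules]

    # Refinement over the finite table of pairs (p(i), f) with i < k.
    table = {(p, i, f) for p in Q for i in range(k) for f in F}
    while True:
        refined = {
            (p, i, f) for (p, i, f) in table
            if limit_ok(p, f) or all(
                any(limit_ok(q, g[1]) or (q, l, g[1]) in table
                    for g in succ.get((("fs", f), a), ()))
                for (a, q, l) in moves(p, i))
        }
        if refined == table:
            break
        table = refined

    def sim(p, i, f):
        return limit_ok(p, f) if i >= k else (p, i, f) in table
    return sim


# Made-up example: p(i) can emit exactly i a's; f2 -a-> f1 -a-> f0 is a chain.
sim = ocn_simulated_by_fs(
    ["p"], {"p": [("a", "p", -1)]}, {},
    ["f0", "f1", "f2"], {("f2", "a", "f1"), ("f1", "a", "f0")},
)
```

In the example, `sim("p", 2, "f2")` holds because `f2` can perform two `a` steps, while `sim("p", 3, "f2")` fails because the answer for counter values of at least $k = 3$ is read off the characteristic system, whose `a`-loop cannot be matched by a finite chain.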

Lemma 3. The problem whether a finite-state process can be simulated by a OC-N 
process is in P. 

Proof. Let $\mathcal{N} = (Q, \{I, Z\}, Act, \delta)$ be a one-counter net and
$\mathcal{F} = (F, Act, \to)$ a finite-state system. Similarly as in the previous lemma
we show that (a description of) the simulation preorder between processes of
$\mathcal{F}$ and $\mathcal{N}$ can be computed in time which is polynomial in the size
of $\mathcal{N}$ and $\mathcal{F}$. However, the argument is slightly more
complicated in this case.

We start with one auxiliary definition. For all $f \in F$ and $p \in Q$ we define the
frontier counter value, denoted $V(f,p)$, to be the least $i \in \mathbb{N}_0$ such that
$f \sqsubseteq_s p(i)$; if there is no such $i$, we put $V(f,p) = -1$. Our aim is to
show that every frontier counter value is bounded by $|Q| \cdot |F|$, i.e.,
$V(f,p) \le |Q| \cdot |F|$ for all $f \in F$ and $p \in Q$. Let $m$ be the maximal
frontier value. It suffices to prove that for each $n$ such that $1 \le n \le m$ there
are $f \in F$ and $p \in Q$ such that $V(f,p) = n$. Let us suppose the converse, i.e.,
there is some $n \ge 1$ such that there is at least one frontier value greater than $n$,
some frontier values are (possibly) less than $n$, but no frontier value equals $n$. It
follows directly from the definition of frontier values that the greatest simulation
among the processes of $\mathcal{F}$ and $\mathcal{N}$ is the following relation
$\mathcal{R}$:

$$\mathcal{R} = \{(f, p(i)) \mid f \in F,\ p \in Q,\ V(f,p) \ge 0,\ i \ge V(f,p)\}$$

Now we show that if there is some $n$ with the above stated properties, then we can
actually construct a simulation which is strictly larger than $\mathcal{R}$, which is a
contradiction. Let $\mathcal{R}'$ be the following finite relation:

$$\mathcal{R}' = \{(g, q(c)) \mid g \in F,\ q \in Q,\ V(g,q) > n,\ c = V(g,q) - 1\}$$

As $n < m$, $\mathcal{R}'$ is clearly nonempty. We show that
$\mathcal{R} \cup \mathcal{R}'$ is a simulation. To do that, it suffices to check the
simulation condition for pairs of $\mathcal{R}'$, because $\mathcal{R}$ itself is a
simulation. Let $(g, q(c)) \in \mathcal{R}'$ and $g \xrightarrow{a} h$. We need to find
some move $q(c) \xrightarrow{a} \alpha$ such that the pair $(h, \alpha)$ is related by
$\mathcal{R} \cup \mathcal{R}'$. However, as $(g, q(c)) \in \mathcal{R}'$, we have that
$c = V(g,q) - 1$ and hence $(g, q(c+1)) \in \mathcal{R}$. Therefore, there must be some
move $q(c+1) \xrightarrow{a} r(l)$ such that $(h, r(l)) \in \mathcal{R}$ (also observe
that $l \ge n$). It means that $q(c) \xrightarrow{a} r(l-1)$ (here we use the fact that
$c \ge 1$). Now if $(h, r(l-1)) \in \mathcal{R}$, we are done immediately. If it is not
the case, then $l$ is the frontier counter value for $h$ and $r$ by definition, i.e.,
$l = V(h,r)$. As $l \ge n$ and there is no frontier value which equals $n$, we conclude
that $l > n$; but this means that $(h, r(l-1)) \in \mathcal{R}'$ by definition of
$\mathcal{R}'$.

Let $k = |Q| \cdot |F|$. Now let us realize that if we could decide simulation for all
pairs of the form $(f, p(k))$ in polynomial time, we would be done: observe that if
$f \sqsubseteq_s p(k)$, then clearly $f \sqsubseteq_s p(i)$ for all $i \ge k$. As all
frontier counter values are bounded by $k$ (see above), we can also conclude that if
$f \not\sqsubseteq_s p(k)$ then $f \not\sqsubseteq_s p(i)$ for all $i \ge k$.
Simulation between the $k^2$ remaining pairs of the form $(f, p(i))$ where $i < k$
could then be decided in the same way as in the previous lemma, i.e., by computing the
greatest fixed-point of a refinement procedure defined by

- $\mathcal{R}^0 = \{(f, p(i)) \mid f \in F,\ p \in Q,\ i < k\}$

- $\mathcal{R}^{j+1}$ consists of those pairs of the form $(f, p(i))$ such that
$(f, p(i)) \in \mathcal{R}^j$ and for each move $f \xrightarrow{a} g$ there is a move
$p(i) \xrightarrow{a} q(l)$ such that either $(g, q(l)) \in \mathcal{R}^j$, or
$l = k$ and $g \sqsubseteq_s q(k)$.

The greatest fixed-point is reached after (at most) $k^2$ refinement steps and each step
can be computed in polynomial time.

Now we prove that simulation for the pairs of the form $(f, p(k))$ can be indeed
decided in polynomial time. To do that, we show that $f \sqsubseteq_s p(k)$ iff
$f \sqsubseteq_s^{2k^2} p(k)$. It clearly suffices: as $p(k)$ cannot increase the
counter to more than $2k^2 + k$ in $2k^2$ moves, we can decide whether
$f \sqsubseteq_s^{2k^2} p(k)$ simply by computing the $\sqsubseteq_s^{2k^2}$ relation
between the states of the system $\mathcal{F}$ and a finite-state system
$(S, Act, \to)$ where $S = \{(p, i) \mid p \in Q,\ 0 \le i \le 2k^2 + k\}$ and $\to$ is
given by $(p, i) \xrightarrow{a} (q, j)$ iff $p(i) \xrightarrow{a} q(j)$;
then we just look if $f \sqsubseteq_s^{2k^2} (p, k)$. This can be of course done in
polynomial time (see Lemma 1).

Let $j \in \mathbb{N}_0$ be the least number such that $f \not\sqsubseteq_s^{j} p(k)$. Then Al can win the
simulation game in $j$ rounds, which means that there is a sequence

$(f_j, p_j(l_j)) \to (f_{j-1}, p_{j-1}(l_{j-1})) \to \cdots \to (f_1, p_1(l_1)) \to (f_0, -)$

of game positions where $f = f_j$, $p(k) = p_j(l_j)$, and $f_i \not\sqsubseteq_s^{i} p_i(l_i)$ for each $1 \le i \le j$.
Al's attack at a position $(f_i, p_i(l_i))$ is $f_i \xrightarrow{a_i} f_{i-1}$, and Ex's defending move
is $p_i(l_i) \xrightarrow{a_i} p_{i-1}(l_{i-1})$ (observe that, in particular, $f_1 \not\sqsubseteq_s^{1} p_1(l_1)$ and hence $p_1(l_1)$
cannot emit the action $a_1$). Moreover, we assume (w.l.o.g.) that Ex defends ‘optimally’,
i.e., $f_i \sqsubseteq_s^{i-1} p_i(l_i)$ for each $1 \le i \le j$. The first step is to show that $l_i \le 2k$ for
each $1 \le i \le j$. Suppose the converse, i.e., there is some $i$ with $l_i > 2k$. As the
counter can be increased at most by one in a single transition, we can select a (strictly)
increasing sequence of indexes $s_0, s_1, \ldots, s_k$ such that $l_{s_i} = k + i$ for each $0 \le i \le k$.
Furthermore, as $k = |Q| \cdot |F|$, there must be two indexes $s_u$, $s_v$ where $u < v$ such that
$f_{s_u} = f_{s_v}$ and $p_{s_u} = p_{s_v}$. Let us denote $f_{s_u} = f_{s_v}$ by $f'$ and $p_{s_u} = p_{s_v}$ by $p'$. Now we
see (due to the optimality assumption) that $f' \sqsubseteq_s^{s_u - 1} p'(k + u)$ and $f' \not\sqsubseteq_s^{s_v} p'(k + v)$.
As $s_u - 1 \ge s_v$, we also have $f' \sqsubseteq_s^{s_v} p'(k + u)$. However, as $u < v$ we obtain
$f' \sqsubseteq_s^{s_v} p'(k + u) \sqsubseteq_s p'(k + v)$, hence $f' \sqsubseteq_s^{s_v} p'(k + v)$ and we derived a

contradiction. The rest is now easy — if j > 2kf (i.e., if Al cannot win in rounds) 

then there must be some $u > v$ such that $f_u = f_v$, $p_u = p_v$, and $l_u = l_v$. It follows
directly from the fact that $k = |Q| \cdot |F|$ and that each $l_i$ is at most $2k$. Now we can
derive a contradiction in the same way as above: denoting $f_u = f_v$ by $f'$, $p_u = p_v$ by
$p'$, and $l_u = l_v$ by $l'$, we obtain (due to the optimality assumption) that $f' \sqsubseteq_s^{u-1} p'(l')$
and $f' \not\sqsubseteq_s^{v} p'(l')$. As $u - 1 \ge v$, we have the desired contradiction. □

An immediate consequence of Lemma 2 and Lemma 3 is the following theorem: 

Theorem 3. The problem of simulation equivalence between OC-N processes and FS 
processes is in P. 

3 Results about Model-Checking 

In this section we show that there is a close relationship between simulation-checking
problems and the model-checking problem for the formula $\varphi = \nu X.[z]\langle z\rangle X$ of the
modal $\mu$-calculus. It is essentially a simple observation which was (in a similar form)
used already in [2,16,18].

As we omitted a formal definition of syntax and semantics of this logic, we clarify
the meaning of $\varphi$ at this point. Let $T = (S, Act, \to)$ be a transition system. Let
$f_\varphi : 2^S \to 2^S$ be a function defined as follows:

$f_\varphi(M) = \{s \in S \mid \forall (s \xrightarrow{a} s') \text{ we have that } \exists (s' \xrightarrow{a} s'') \text{ such that } s'' \in M\}$

The denotation of $\varphi$ (i.e., the set of states where $\varphi$ holds), written $[\![\varphi]\!]$, is defined by

$[\![\varphi]\!] = \bigcup \{U \subseteq S \mid U \subseteq f_\varphi(U)\}$

Hence, $[\![\varphi]\!]$ is the greatest fixed-point of the (monotonic) function $f_\varphi$. As usual, we
write $t \models \varphi$ instead of $t \in [\![\varphi]\!]$.
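
On a finite transition system the greatest fixed-point can be computed by iterating the function downwards from the full state set. The sketch below is an illustration only; it assumes the reading in which the reply move carries the same label as the first move, and the names `f_phi` and `denotation` are hypothetical.

```python
def f_phi(states, trans, m):
    """States s where every move s -a-> s1 has a reply s1 -a-> s2 with s2 in m.

    trans maps a state to a set of (action, successor) pairs.
    """
    return {
        s for s in states
        if all(
            any(b == a and s2 in m for (b, s2) in trans.get(s1, ()))
            for (a, s1) in trans.get(s, ())
        )
    }


def denotation(states, trans):
    """Greatest fixed point of f_phi, reached by iterating from the full set."""
    current = set(states)
    while True:
        nxt = f_phi(states, trans, current)
        if nxt == current:
            return current
        current = nxt
```

Because the function is monotonic and the state set is finite, the iterates form a decreasing chain that stabilises after at most as many steps as there are states.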

Theorem 4. Let $P$ be a process of a PDA system $\Delta = (Q, \Gamma, Act, \delta)$, and $F$ a process
of a finite-state system $\mathcal{F} = (S, Act, \to)$. Then it is possible to construct (in polynomial
time) processes $A$, $B$ of a PDA system $\Delta_1$, and a process $C$ of a PDA system $\Delta_2$ such
that $P \sqsubseteq_s F$ iff $A \models \varphi$, $F \sqsubseteq_s P$ iff $B \models \varphi$, and $P =_s F$ iff $C \models \varphi$.

Proof. Intuitively, the processes $A$, $B$ and $C$ ‘alternate’ the transitions of $P$ and $F$ in
an appropriate way. We start with the definition of $\Delta_1$. The set of control states of $\Delta_1$ is
$Q \times S \times (Act \cup \{?\}) \times \{0, 1\}$, the set of actions is $Act$, the stack alphabet $\Gamma_1$ is $\Gamma \cup \{Z\}$
where $Z \notin \Gamma$ is a fresh symbol (bottom of stack). The set of transitions is the least set
$\delta_1$ satisfying the following:

- if $pX \xrightarrow{a} q\alpha$ is a rule of $\delta$, then $(p, F, ?, 0)X \xrightarrow{a} (q, F, a, 1)\alpha$ and
$(p, F, a, 0)X \xrightarrow{a} (q, F, ?, 1)\alpha$ are rules of $\delta_1$ for each $F \in S$;

- if $F \xrightarrow{a} F'$, then $(p, F, ?, 1)X \xrightarrow{a} (p, F', a, 0)X$ and $(p, F, a, 1)X \xrightarrow{a} (p, F', ?, 0)X$
are rules of $\delta_1$ for all $p \in Q$ and $X \in \Gamma_1$.

Let $P = p\alpha$. We put $A = (p, F, ?, 0)\alpha Z$ and $B = (p, F, ?, 1)\alpha Z$. Observe that $A$
alternates the moves of $P$ and $F$: first $P$ performs a transition whose label is stored
in the finite control and passes the token to $F$ (by changing 0 to 1); then $F$ emits some
transition with the same (stored) label and passes the token back to $P$. The new bottom
symbol $Z$ is added to ensure that $F$ cannot ‘die’ within $A$ just due to the emptiness of
the stack. Now it is obvious that $P \sqsubseteq_s F$ iff $A \models \varphi$; the fact that $F \sqsubseteq_s P$ iff $B \models \varphi$
can be justified in the same way.

The definition of $C$ is now easy to see: it suffices to ensure that the only
transitions of $C$ are $C \xrightarrow{a} C'$ and $C \xrightarrow{a} C''$ where $C' \xrightarrow{a} A$ and $C'' \xrightarrow{a} B$. This can be
achieved by a straightforward extension of $\Delta_1$. □

The proof of Theorem 4 carries over to processes of one-counter automata and
one-counter nets immediately (observe there is no need to add a new bottom symbol when
constructing $\Delta_1$ and $\Delta_2$ because the zero-marker of one-counter systems is never
removed from the stack by definition).

Corollary 1. The model-checking problem for $\varphi$ is

- EXPTIME-complete for PDA processes;

- coNP-hard for OC-A processes.

As simulation between OC-N and FS processes is in P, Theorem 4 does not imply
any hardness result for model-checking with OC-N processes. Therefore, we examine
this problem ‘directly’ by showing that a simple fixed formula $\Diamond[a]\Diamond[b]\mathit{ff}$ of the logic
EF is NP-hard for OC-N processes. In our proof we use a slightly modified version of
the construction which was given in [14] to prove DP-hardness of weak bisimilarity
between OC-N and FS processes. To make this paper self-contained, we present a full
proof here.

Theorem 5. Let $p(0)$ be a process of a one-counter net $\mathcal{N}$. The problem if $p(0) \models
\Diamond[a]\Diamond[b]\mathit{ff}$ is NP-hard.

Proof. Let $\varphi = C_1 \wedge \cdots \wedge C_m$ be a formula in CNF where $C_i$ are clauses over propositional
variables $x_1, \ldots, x_n$. We construct an OC-N system $\mathcal{N} = (Q, \{I, Z\}, \{a, b, \tau\}, \delta)$
and its process $p(0)$ such that $\varphi$ is satisfiable iff $p(0) \models \Diamond[a]\Diamond[b]\mathit{ff}$. The construction
of $\mathcal{N}$ will be described in a stepwise manner. The sets $Q$ and $\delta$ are initialized as follows:
Q = {g}, 6 = {ql ql, qZ qZ}. Now, for each clause Ci, 1 < i < m, we do the 
following: 

- Let $\pi_j$ denote the $j$-th prime number. We add a new control state $c_i$ to $Q$. Moreover,
for each variable $x_j$ and each $k$ such that $0 \le k < \pi_j$ we add to $Q$ a control state
$\langle C_i, x_j, k\rangle$.

- For each newly added control state s we add to ^ the transitions si ql, sZ 
qZ. 

- For each 1 < j < n we add to <5 the transitions c*/ ^ {Ci, Xj, 0)1. 

- For all j, k such that I < j < n and 0 < fc < tt^ we add to 5 the transition 

{Ci,Xj, k)I A {Ci,Xj, {k -f 1) mod nj)e. 

- We add to 5 the ‘loops’ c*/ Cil, CiZ CiZ. 

- For all j, k such that 1 < j < n and 0 < fc < Try we add to 5 the loop {Ci, Xj , k)I 

{Ci , Xj , k) I . 

- If a variable Xj does not appear positively in a clause Ci, then we add to S the loop 

{Ci,Xj,0)Z ^ {Ci,Xj,0)Z. 

- If a variable Xj does not appear negatively in a clause Ci, then we add to 6 the loops 
{Ci,Xj,k)Z {Ci,Xj,k)Z for every 1 < fc < Hj. 

If we draw the transition system which is generated by the current approximation of $\mathcal{N}$,
we obtain a collection of graphs $G_i$, $1 \le i \le m$; each $G_i$ corresponds to the ‘subgraph’
of the transition system associated to $\mathcal{N}$ which is obtained by restricting $Q$ to the set of
control states which have been added for the clause $C_i$. The structure of $G_i$ is shown in
the following picture (the $a$-transitions to the states of the form $q(j)$ are omitted as the
picture would become too complicated).




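[Picture of $G_i$ not reproduced. Node labels: $q$, $c_i$, $\langle C_i,x_1,0\rangle$, $\langle C_i,x_1,1\rangle$, $\langle C_i,x_2,0\rangle$, $\langle C_i,x_2,1\rangle$, $\langle C_i,x_2,2\rangle$, ..., $\langle C_i,x_n,0\rangle$, $\langle C_i,x_n,1\rangle$, ..., $\langle C_i,x_n,\pi_n-1\rangle$]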



Now we can observe the following:

- For each $l \ge 0$, the state $c_i(l)$ ‘encodes’ the (unique) assignment $\nu_l$ in the same
way as in the proof of Theorem 2, i.e., $\nu_l$ is defined by $\nu_l(x_j) = \mathit{tt}$ iff $c_i(l) \to^*
\langle C_i, x_j, 0\rangle(0)$; conversely, for each assignment $\nu$ there is $l \in \mathbb{N}$ such that $\nu = \nu_l$
(for example, we can put $l = \prod_{j=1}^{n} f(j)$, where $f(j) = \pi_j$ if $\nu(x_j) = \mathit{tt}$, and
$f(j) = 1$ otherwise).

- For each $l \ge 0$ we have that $\nu_l(C_i) = \mathit{tt}$ iff $c_i(l) \models \Diamond[b]\mathit{ff}$. Indeed, observe
that $\nu_l(C_i) = \mathit{tt}$ iff $c_i(l)$ can reach some of the ‘zero-states’ where the action $b$ is
disabled.
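
The correspondence between counter values and assignments can be made concrete as follows. This is a minimal sketch under the reading that a variable is true exactly when its prime divides the counter value; the function names and the signed-index clause format are hypothetical.

```python
def first_primes(n):
    """Return the first n primes by trial division."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def encode(assignment):
    """Counter value for an assignment given as a list of booleans (x_1..x_n)."""
    value = 1
    for prime, truth in zip(first_primes(len(assignment)), assignment):
        if truth:
            value *= prime
    return value


def decode(value, n):
    """Assignment encoded by a counter value: x_j is true iff the j-th prime divides it."""
    return [value % prime == 0 for prime in first_primes(n)]


def clause_true(clause, value, n):
    """Evaluate a clause given as signed indices, e.g. [1, -3] for (x_1 or not x_3)."""
    nu = decode(value, n)
    return any(nu[abs(lit) - 1] == (lit > 0) for lit in clause)
```

Many counter values encode the same assignment (for example 10 and 20 for three variables), which is why the text speaks of some suitable value rather than a unique one.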

We finish the construction of M by connecting the Gi components together. To do that, 
we add two new control states p and r to Q, and enrich 5 by adding the transitions 
pZ pIZ, pi pi I, pi ql, pZ qZ, pi rl, and rl Cil for every 
$1 \le i \le m$. The structure of the transition system associated to $\mathcal{N}$ is shown below
(again, the $a$-transitions to the states of the form $q(j)$ are omitted).




Now we can observe the following: 

- The only states which can (potentially) satisfy the formula $[a]\Diamond[b]\mathit{ff}$ are those of
the form $r(l)$, because all other states have an $a$-transition to a state of the form
$q(j)$ where it is impossible to get rid of $b$’s.

- A state $r(l)$ satisfies the formula $[a]\Diamond[b]\mathit{ff}$ iff $c_i(l) \models \Diamond[b]\mathit{ff}$ for all $1 \le i \le m$ iff
$\nu_l(C_i) = \mathit{tt}$ for each $1 \le i \le m$ (due to the previous observations) iff $\nu_l(\varphi) = \mathit{tt}$.

Hence, $\varphi$ is satisfiable iff there is $l \in \mathbb{N}$ such that $r(l)$ satisfies $[a]\Diamond[b]\mathit{ff}$ iff $p(0) \models
\Diamond[a]\Diamond[b]\mathit{ff}$. □



Corollary 2. Let $p(0)$ be a process of a one-counter net $\mathcal{N}$. The problem if $p(0) \models
\Box\langle a\rangle\Box\langle b\rangle\mathit{tt}$ is coNP-hard.



4 Conclusions 

This paper fills some gaps in our knowledge of the complexity of simulation-checking and
model-checking with (subclasses of) pushdown automata. The following table gives a
summary of known results (contributions of this paper are in boldface). For comparison,
related results about checking strong and weak bisimilarity (denoted by $\sim$ and
$\approx$, respectively) with finite-state processes are also shown. The overview supports the
claim that simulation tends to be computationally harder than bisimilarity; to the best
of our knowledge, there is so far no result violating this ‘rule of thumb’.





              PDA                     BPA               OC-A             OC-N

$\sim$ FS     PSPACE-complete [19]    $\in$ P [17]      $\in$ P [14]     $\in$ P [14]

$\approx$ FS  PSPACE-hard [19],       $\in$ P [17]      DP-hard [14]     DP-hard [14]
              $\in$ EXPTIME [10]

$=_s$ FS      EXPTIME-complete        coNP-hard [16]    coNP-hard        $\in$ P

Abstract. Multimedia presentations and their applications are becoming
more and more popular in most spheres of industry and everyday
life. A database approach could help in querying presentations and in
reusing parts of existing presentations to create new ones. In this paper,
we propose an object-oriented model built on a temporal interval tree
structure for managing multimedia presentations as temporal databases.
This model is based on a class hierarchy that reflects the temporal
relationships among the multimedia data comprising a presentation. Specifically,
we discuss extending this approach to reusing animations in our
presentation database system. Based on the object-oriented data model
and the database system, we propose storing animations
in a database of 3D geometric models and motion descriptions. The
reuse of 3D models is common practice, whereas the reuse of motion
is rare because it is not as straightforward. We therefore explore
motion reuse techniques further. A set of generic and animation-oriented
operations is given, which can be used to query and adapt animations for
multimedia presentations based on their metadata.



1 Introduction 

Currently, multimedia presentations are increasingly being used in most areas
where computers are utilized. Multimedia-supported lectures, instruction manuals,
and animated presentations are only a few examples. Therefore, the
need for a system that manages multimedia presentations elegantly, with
authoring, querying, and integrated browsing features, should be obvious. Databases
have evolved over the years and have found applications in numerous areas. A
database approach to the querying and authoring problem for multimedia presentations
will not only make the system easier for users to learn, but will also make
the presentation system more powerful by drawing on mature database
techniques.

Simultaneously, animations are also becoming a popular medium in multimedia
presentations. Despite this trend, however, computer-animated sequences are
still difficult to produce, even with the advent of fast and powerful machines,
graphics accelerators, and high-end graphics authoring software. Thus, novice
users who require animations in their multimedia presentations may find it
difficult to produce good animation sequences. The difficulty comes primarily from
generating motion. Creating motion of good quality requires considerable effort
from skilled animators, actors, and engineers, for example using motion capture
technology.

Despite its high cost, motion is not commonly reusable. Motion for a par- 
ticular model is very specific and is unusable for other models and scenarios. 
For instance, the motion of a man picking up a glass will be precisely that - the 
motion will be difficult to reuse for having the character pick up other objects 
from the ground, or even for a different character to pick up the same glass. 
Computer animation research has evolved four general strategies for the problem of
producing motion. The first is to improve the tools used for keyframing. The
second uses procedural methods to generate motion based on descriptions
of goals. The third tracks the motion of real-world actors or objects. The fourth
attempts to adapt existing motion generated by some other method.

The fourth approach could put animation capabilities in the hands of in- 
experienced animators and allow the use of motion created by others in new 
scenarios and with other characters. It can also enable “on-the-fly” adaptation 
of motions. One promising approach to motion adaptation, presented by Bruderlin
and Williams [3], treats motions as signals and applies traditional signal
processing to adapt them, while preserving aspects of their character. A variant
of one of their methods, motion-displacement mapping, was simultaneously
introduced as “motion warping” by Witkin and Popovic [8].

Database aspects of multimedia presentations are explored in this paper by
providing a set of algebraic database operations designed for multimedia
presentations based on the temporal model. These operations help in creating and
querying presentation databases, together with user-interactive operations
such as forward, rewind, skip, and linking to other databases.

This paper focuses on extending the database approach to animation by
exploiting motion reuse. This approach is intended for novice or non-skilled
animators who need reasonably good animations quickly for multimedia
applications. We take a high-level, abstract view of animations and store them
in databases. From a pool of animation resources in VRML format, we add
descriptions, or metadata, to the database. The user can then search the
animations and find the particular objects and motion required for the intended
animation. Scenes or characters can be queried using filters or conditions such as
color or spatial properties.

The query results can then be combined and manipulated to make the scene. 
We provide manipulations for scaling, translation and rotation. Temporal as- 
pects of the animation can also be altered, such as extracting only a portion 
of the animation. We also provide a set of motion editing operators for greater 
flexibility. Motion of one character can be reused for other characters with a
similar structure.

2 Related Work

Most advanced animation tools focus mainly on rendering realistic models and
are intended for skilled animators. However, there are several software tools that
cater to novice users. Simply 3D by Metacreations [9] provides a catalog
of 3D models that can be dragged into a scene. It also provides features to
animate the models. However, the results are limited to simple translations, rotations, and
trivial combinations of the two.

Creating a database of both models and motion shows great potential. However,
the main obstacle is how to reuse motion. Recently, there has been
considerable effort to develop techniques for retargeting motion to new characters
and scenes. Examples of such work are those of Gleicher [4], Popovic [10], and
Hodgins [6].

One related work that makes use of databases in animation is that of
Kakizaki [7], who uses a scene graph and an animated agent in multimedia
presentations. The animated agent performs a presentation based on a text description.
In order to generate the agent’s motion automatically, he categorized the
motion of the agent in the presentation into three classes, namely pointing, moving,
and gesturing. To determine what the target object is and to extract detailed
information about the object, the system accesses a scene graph that contains
all the information about the virtual environment. This information, stored in
a database, is used for determining the details needed to generate the agent’s motion.

Similarly, Ayadin et al. [1] have used databases to guide the grasp
movement of virtual actors. Their approach creates a database based on divisions of
the reachable space of a virtual actor. Objects are classified into a set of
primitives such as blocks, cylinders, and spheres. Both attempts at storing objects in
databases have been fairly successful. However, the problem of motion reuse is
not addressed.

3 Object-Oriented Data Model 

To understand the data model we consider an example multimedia presenta- 
tion with six objects, as shown in Figure 1. The start and end of an object 
presentation can be considered as left and right end points of a line segment. 
The endpoints of these line segments (corresponding to each object presentation)
are sorted (with duplicates removed) to obtain the sequence $y_0, y_1, \ldots, y_m$
($m < 2N$, $N$ being the number of objects in a multimedia presentation). The
primary structure of the interval tree is a complete binary tree with $m+1$
external (i.e., leaf) nodes such that when the tree is flattened and the non-leaf
nodes are removed, external node $i$ corresponds to $y_i$ (the endpoint
corresponding to the start/end of object(s) presentation) [5,11]. Each leaf node is
labeled with its corresponding endpoint, i.e., $y_i$ for the $i$-th leaf node. Non-leaf
nodes are labeled with a value that lies between the maximum value in the left
subtree and the minimum value in the right subtree (usually, the average of these
values is used to label the node). Hence, each non-leaf node, say $v$, serves as a
key to a pair of secondary structures LS and RS. LS and RS represent the sets
of left and right end points of the temporal intervals for which $v$ is a common
ancestor. Temporal relationships in a multimedia presentation can be viewed as
an interval tree. For instance, Figure 2 describes the interval tree structure for
the example multimedia presentation in Figure 1. The interval tree also has two
other structures, secondary and tertiary, to help in handling different temporal
queries. From a multimedia authoring point of view, we focus on the primary
structure of the interval tree discussed here.

Fig. 1. An Example Multimedia Presentation (figure not reproduced; the time axis is marked at 6, 10, 14, 18, 22, and 30)
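
The primary structure described above can be sketched in a few lines. This is an illustration under simplifying assumptions: the tree is balanced rather than strictly complete, internal labels use the average rule, and the names `Node`, `build`, `primary_structure` and `leaves` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points):
    """Build a balanced tree whose leaves, read left to right, are the sorted endpoints."""
    if len(points) == 1:
        return Node(points[0])
    mid = len(points) // 2
    # Label the internal node with the average of the largest value on the
    # left and the smallest value on the right.
    label = (points[mid - 1] + points[mid]) / 2
    return Node(label, build(points[:mid]), build(points[mid:]))


def primary_structure(intervals):
    """intervals: (start, end) pairs, one per object presentation."""
    endpoints = sorted({t for interval in intervals for t in interval})
    return build(endpoints)


def leaves(node):
    """Leaf labels from left to right."""
    if node.left is None and node.right is None:
        return [node.label]
    return leaves(node.left) + leaves(node.right)
```

For three hypothetical objects presented during (0, 10), (6, 14) and (10, 22), the leaves are 0, 6, 10, 14, 22 and the root is labeled 8.0.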



The interval tree representation of temporal relationships in a multimedia
presentation can be viewed as a class hierarchy in an object-oriented environment,
as described in Figure 3. The root node, which corresponds to the entire
presentation interval, represents the multimedia presentation class. This class has
two subclasses: non-leaf and leaf. The non-leaf class represents an internal node of
the interval tree whose interval lies within its parent interval. The leaf class
represents an external node of the tree whose interval index lies within its parent
interval. Multimedia data to be presented is a subclass of the leaf class. This
subclass represents the start or the end of the multimedia data presentation
corresponding to this leaf node. It may also represent the fact that the
data is being presented during the interval index represented by the leaf class. A
multimedia data class can offer different choices for rendering, based
on language (of audio or text), data/compression format, or object quality
(to cope with varying availability of system resources such as network
bandwidth). These choices are represented by the multimedia data choice class, which forms
a subclass of the multimedia data class.

A multimedia presentation can have links to other presentation databases or
objects within the same database. These links are represented as a link or button
class, which forms a subclass of the leaf class in the proposed object-oriented model.
Ultimately, all these classes belong to a general class that can be described as
the multimedia presentation class. The IS-A relationship, which is valid in an
object-oriented environment, also holds here. For example, the non-leaf class IS-A
multimedia presentation class and the multimedia data class IS-A leaf class.

Fig. 2. Interval Tree Representation for the Example Multimedia Presentation




Fig. 3. Object-oriented Representation of Multimedia Presentation 



4 Animation Database 

Animation is a special kind of multimedia data class. It is stored in its own
database because of its inherent complexity and special attributes. A
presentation of an animation is therefore part of the leaf class of an interval
tree. This section discusses how an animation at a particular time is stored in
the database.

Geometric models and motion are the two indispensable components of com- 
puter animations. In virtual reality systems, information about the objects in 
the environment is stored in a scene graph [7]. Here we adapt the concept of 
the scene graph to break down an animation into atomic objects, which can be 
extracted, replaced, or deleted. 
Fig. 4. An animation of a ball rolling on a table. The scene graph for this
example decomposes the scene into a tree of its atomic objects. The nodes contain
information about the object including motion.



The scene graph was originally designed for real-time 3D graphics systems. It
contains data about the objects in a scene, such as their shape, size, color, and 
position. Each piece of information is stored as a node in the scene graph. These 
nodes are arranged hierarchically in the form of a tree structure. 

We have included information about the motion of the object and metadata 
for the motion. Motion can be represented in a number of ways. It can be stored 
as a set of interpolation points, an equation, or a signal. Presently, we store 
motion as a set of interpolation points in our database. 

Queries to retrieve particular objects are answered by looking up the metadata
in the scene graph database. We use the Virtual Reality Modeling Language
(VRML) for the purpose of demonstration. Currently, the metadata are created
manually, but we are working on generating them automatically. To achieve this,
we propose to embed the metadata in the VRML files as comments.
Since comments do not affect the VRML browser, it is
possible to associate metadata with VRML nodes in this way.

To provide a consistent framework, we also represent the animation in an 
object-oriented data model. The class diagram is shown in Figure 5. An ani- 
mation is stored as a scene class which has categories and objects. Objects in 
turn have a motion class and may also be made up of other objects. This model 
can represent the scene graph of a particular animation and add categories for 
indexing. 
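
The class structure just described can be sketched as follows. This is only an illustration of the scene, category, object and motion relationships; the field names and the `find` helper are hypothetical and are not taken from the system itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Motion:
    name: str
    keys: List[Tuple[float, float]] = field(default_factory=list)  # (time, value) keyframes


@dataclass
class SceneObject:
    name: str
    metadata: dict = field(default_factory=dict)
    motion: Optional[Motion] = None
    parts: List["SceneObject"] = field(default_factory=list)  # objects made up of other objects


@dataclass
class Scene:
    name: str
    categories: List[str] = field(default_factory=list)
    objects: List[SceneObject] = field(default_factory=list)

    def find(self, **conditions):
        """Return all objects, including nested parts, whose metadata matches every condition."""
        found = []
        stack = list(self.objects)
        while stack:
            obj = stack.pop()
            if all(obj.metadata.get(k) == v for k, v in conditions.items()):
                found.append(obj)
            stack.extend(obj.parts)
        return found
```

A query by metadata then reduces to a traversal of the object tree, for example `scene.find(color="red")`.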

Fig. 5. Object-oriented Representation of an Animation.

An important consideration is how to store an abstract entity such as motion
in a database. To this end we adopt Gleicher’s representation of
motion. Using Gleicher’s framework [4], we can denote motion as a set of
signals for a particular model. Consequently, we use his notation, denoting
$\mathbf{p}$ as a vector and $p_i$ as an individual scalar parameter. We therefore have $\mathbf{p}(t)$,
a function of time, defining the model’s configuration. The signals are represented in
the system by a set of keyframe sample points. To use the set of points, we
interpolate them to create a continuous signal, which is then resampled to
produce the new animation. The motion signal is therefore defined by

$\mathbf{p}(t) = \mathrm{interp}(t, \mathit{keys})$, (1)

where keys are the keyframe coordinates. They are then stored in the database
as a set of points.
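
Equation (1) can be illustrated for a single scalar parameter. The text does not say which interpolation scheme is used, so the sketch below assumes piecewise-linear interpolation with clamping at both ends; `interp` and `resample` are hypothetical helper names.

```python
from bisect import bisect_right


def interp(t, keys):
    """Piecewise-linear interpolation through keyframes.

    keys: list of (time, value) pairs sorted by time; values outside the
    keyframe range are clamped to the first or last key.
    """
    times = [k[0] for k in keys]
    if t <= times[0]:
        return keys[0][1]
    if t >= times[-1]:
        return keys[-1][1]
    i = bisect_right(times, t)
    (t0, v0), (t1, v1) = keys[i - 1], keys[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def resample(keys, step):
    """Sample the interpolated signal at a fixed time step."""
    t, end = keys[0][0], keys[-1][0]
    samples = []
    while t <= end:
        samples.append((t, interp(t, keys)))
        t += step
    return samples
```

A smoother scheme such as spline interpolation could be substituted for `interp` without changing how the keyframes are stored.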

5 Proposed Operations 

5.1 Generic Database Operations 

Operations for manipulating multimedia presentation structure include insert, 
delete, append, and merge. These operations modify the object hierarchy of a 
presentation database depending on the changes to the interval tree structure. 

1. Insert: insert a specified object or a group of objects in the presentation 
database for the given presentation duration and starting point. The insert 
operation is carried out, modifying the interval tree structure associated with 
the multimedia presentation. 

2. Delete: delete a specified object from the presentation database. This op- 
eration again modifies the interval tree structure associated with the presen- 
tation and hence the object hierarchy associated with the database. 

3. Append: append a presentation database to another. This operation in- 
tegrates the objects in the two specified presentation databases sequentially 
in time domain. 

4. Merge: merge two presentation databases with the same domain. This oper- 
ation integrates the interval tree structure of the two presentation databases, 
by preserving the temporal relationships of both the databases. 

5. Browse: browse a multimedia presentation based on the interval tree
structure.

6. Select: select a multimedia presentation database based on the specified
conditions. The conditions associated with the select operation can involve 
an object-search term or interval-search term or both. This operation does 
not modify the interval tree structure, and it returns a set of objects as 
response (without the interval tree structure). 

7. Project: project desirable portions of the interval tree associated with a 
multimedia presentation database. Conditions associated with project op- 
eration can involve an object-search term or interval-search term or both. 
The response is a set of objects whose hierarchy reflects the interval tree 
structure of the projected portion of the multimedia presentation database. 

8. Join: join two multimedia presentation databases. It is different from the 
merge operation in the sense that join operation can have conditions that 
need to be satisfied for an object to be part of the response. These conditions 
can involve interval conditions and/or object attribute conditions. 

5.2 User Interactive Operations 

During a multimedia presentation, the user can interact by giving different types 
of inputs. Inputs such as skip and reverse presentation can work on the interval 
tree structure to achieve the desired effect. Inputs such as scaling the speed of 
presentation can modify the intervals associated with interval tree to reflect the 
change in the presentation speed. For carrying out these user interactive opera- 
tions, the proposed object-oriented model supports the following operations. 

1. Compress Interval: the operation helps in scaling down the interval du- 
rations in a multimedia presentation database by a specified factor. 

2. Expand Interval: the operation helps in scaling up the interval durations
in a presentation database by a specified factor. Together with
compress interval, this operation supports scaling the speed of data delivery during
an actual multimedia presentation.

3. Skip Interval: the operation helps in moving to a time index in a multimedia
presentation by skipping the specified interval, and hence skipping
the presentation of multimedia data that lie in the skipped interval.

4. Reverse Browse: the operation helps in traversing the interval tree asso- 
ciated with a multimedia presentation database in the reverse order. This 
operation can be combined with compress/expand interval operations to 
simulate fast or slow rewind VCR type operation. 

These user interactive operations can be provided as functions in the multi- 
media presentation class. In a similar manner, functions can be provided in the 
multimedia presentation class for making choices on languages, data/compression 
formats, or quality of media objects, based on user preferences and/or system 
parameters. 
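
The four interactive operations can be sketched on a flat list of presentation intervals. This is a simplification that leaves out the interval tree itself, and the function names and argument conventions are hypothetical.

```python
def compress(intervals, factor):
    """Scale all interval end points down by `factor` (faster playback)."""
    return [(start / factor, end / factor) for start, end in intervals]


def expand(intervals, factor):
    """Scale all interval end points up by `factor` (slower playback)."""
    return [(start * factor, end * factor) for start, end in intervals]


def skip(intervals, now, length):
    """Jump from time `now` to `now + length`.

    Returns the new time index and the intervals that still end after it.
    """
    target = now + length
    return target, [(s, e) for s, e in intervals if e > target]


def reverse_browse(intervals):
    """Visit intervals from the latest end point to the earliest."""
    return sorted(intervals, key=lambda iv: iv[1], reverse=True)
```

Combining `reverse_browse` with `compress` or `expand` gives the fast or slow rewind behaviour mentioned in the list above.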

5.3 Animation Database Operations 

We adapt some of the generic operations to cater to animation. The framework
comprises two sets of operators to create animation sequences: the
query operators and the authoring operators. The query operators are used to
search for specific characters, scenes, and motion based on some conditions.
The query operators may use the 12 relationships of the objects: before, behind,
inside, contain, meet, overlap, above, under, start, finish, left, and right.
Alternatively, the user may use the metadata of the scenes, objects, categories, and
motion.
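
The text names the relationships but does not define them. As an illustration, the temporal ones (before, meet, overlap, start, finish) can be read in the usual interval-relation sense; the definitions below follow that assumed reading, and the `query` helper is hypothetical.

```python
def before(a, b):
    """a ends strictly before b starts."""
    return a[1] < b[0]


def meet(a, b):
    """a ends exactly where b starts."""
    return a[1] == b[0]


def overlap(a, b):
    """a starts first and ends inside b."""
    return a[0] < b[0] < a[1] < b[1]


def start(a, b):
    """a and b start together and a ends first."""
    return a[0] == b[0] and a[1] < b[1]


def finish(a, b):
    """a and b end together and a starts later."""
    return a[1] == b[1] and a[0] > b[0]


def query(objects, relation, reference):
    """Names of objects whose interval stands in `relation` to `reference`.

    objects maps a name to its (start, end) presentation interval.
    """
    return sorted(name for name, iv in objects.items() if relation(iv, reference))
```

The spatial relationships (behind, inside, above, and so on) would need bounding volumes rather than intervals and are not covered by this sketch.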

The authoring operators are then used on the queried results. These are 
operations that can be performed on the database for an animation represented 
in VRML format. There are two aspects in authoring animations. The first aspect 
is the spatial aspect. In animation, combining objects may require changing the 
position, size, and orientation of objects. The operations shown in Table 1 are 
used to query and manipulate the spatial attributes of the animation. 



Table 1. Animation Database Operations and Syntax

Operation   Syntax                                 Brief Description

Insert      INSERT object TO scene                 Insert object to a scene
            [AT (x, y, z)]                         under specified conditions
            [WHEN start time UNTIL stop time]

Delete      DELETE object FROM scene               Delete object from a scene

Extract     EXTRACT objects FROM scene             Extract the specified objects

Select      SELECT objects FROM                    Select the objects based
            scene WHERE [condition]                on specified conditions

Project     PROJECT newscene FROM                  Project a specific time interval
            oldscene WHERE                         from an animation in a scene
            [object/temporal condition]            under specified conditions

Join        JOIN scene1 scene2                     Join animations of two scenes
            WHERE [object/temporal condition]      based on specified conditions



The second aspect is concerned with the temporal information of animation, 
which is basically motion. Motion is an integral part of any animation. Here, we 
propose several authoring operations for retargeting motion for new characters 
and scenes. The operations for manipulating motion are use, get, and disregard. 
The syntax is shown in Table 2. 



Table 2. Motion Operations and Syntax

Operation   Syntax                        Brief Description

Use         USE motion TO object          Utilize the motion on a character
Get         GET motion FROM object        Extract motion from a character
Disregard   DISREGARD motion OF object    Delete a motion from an object







158 Zhiyong Huang et al. 



Among the three operations listed in Table 2, the use operation is the most difficult. We
have adapted the constraint-based approach proposed by Gleicher [4]. This
approach combines constraint methods with motion-signal processing methods.
It enables us to adapt motion to a new character while retaining the desired
quality of the original motion.

6 System Design and Implementation 

We have used JDK 1.2 on a Solaris system to implement the object-oriented model,
the database, the user interface, and the link operations. JMF 2.0 is employed to
enable the control operations on media data. Java Project X is used to process
the XML representation of the object-oriented model.

As shown in Figure 6, visual authoring makes it convenient to create a
presentation as a whole: the author draws temporal interval lines on a
time axis and then defines, or queries for, the associated objects, whose
spatial and system parameters are set during creation. Animation is created
by importing VRML files and specifying temporal constraints on the files to
create the desired animations. Visual authoring, together with the proposed database
operations, provides a complete package for authoring and manipulation.




Fig. 6. Visualized Creation of a Multimedia Presentation 



The presentation browser shown in Figure 7 displays the presentation spatially 
and temporally as predefined. End users can also control each media file player 
interactively in the course of delivery. The players include an animation 
player, which is used to view and modify the animation, as shown in Figure 7. 

Fig. 7. Delivery of a Multimedia Presentation 



6.1 Example 

In this section, we describe an example of creating an animation for a 
multimedia presentation using the operations described above. In this example, 
the user needs an animation of a walking woman in a room with a brown door. The 
user starts by querying and collecting the necessary models and motion. After 
the query, the user inserts them into a VRML scene with the specified temporal 
parameters. One snapshot of the resulting animation is shown in Figure 8. The 
order of the operations is as follows: 



SELECT door FROM allobjects WHERE COLOR=brown 
EXTRACT woman FROM scene 
GET walking FROM walkingman 
USE walking TO woman 

SELECT room FROM allobjects WHERE TYPE=HDB5A 
INSERT door -scale(30, 30, 30) TO room AT [50,50,50] WHEN [0,10] 
INSERT woman TO room WHEN [6,12] 
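
The sequence of operations above can be mimicked with a toy in-memory model. The following Python sketch is only an illustration under strong assumptions: the function names, the dictionary layout, and the sample data are hypothetical and are not part of the system described in this paper. In particular, use_motion simply copies the motion, whereas the actual USE operation relies on the constraint-based retargeting of [4].

```python
# Toy model of the authoring operations; every name and datum here is hypothetical.

def select(objects, **conditions):
    """SELECT ... FROM objects WHERE key=value: return a copy of the first match."""
    for obj in objects:
        if all(obj.get(k) == v for k, v in conditions.items()):
            return dict(obj)  # copy, so the stored object is left untouched
    raise LookupError(f"no object matches {conditions}")

def get_motion(name, source):
    """GET motion FROM object: extract a named motion from a character."""
    return source["motions"][name]

def use_motion(name, motion, target):
    """USE motion TO object: attach the motion to another character (no retargeting)."""
    target.setdefault("motions", {})[name] = motion
    return target

def insert(obj, scene, when, at=None, scale=None):
    """INSERT object TO scene [AT position] WHEN [start, end]."""
    scene.setdefault("children", []).append(
        {"object": obj["name"], "when": when, "at": at, "scale": scale}
    )
    return scene

all_objects = [
    {"name": "door", "color": "brown"},
    {"name": "room", "type": "HDB5A"},
]
woman = {"name": "woman"}  # stands in for EXTRACT woman FROM scene
walkingman = {"name": "walkingman", "motions": {"walking": ["step_left", "step_right"]}}

door = select(all_objects, color="brown")
walking = get_motion("walking", walkingman)
use_motion("walking", walking, woman)
room = select(all_objects, type="HDB5A")
insert(door, room, when=(0, 10), at=(50, 50, 50), scale=(30, 30, 30))
insert(woman, room, when=(6, 12))
```

After these calls, room holds two timed children (the door on [0,10] and the woman on [6,12]), and woman carries the walking motion taken from walkingman.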



7 Conclusion and Future Work 

Authoring, querying, and browsing are three major aspects of multimedia 
presentation. Integrating these three features with popular database 
approaches is a natural response to the growing needs of multimedia 
presentations. In this paper, we have proposed an object-oriented data model 
for multimedia presentation databases. The class hierarchy of this model is 
constructed from the interval tree structure of the temporal relationships in 
a multimedia presentation. 

More specifically, we focus on animation database aspects in this paper, by 
exploring animation database operations and animation scene modeling based 
on the object-oriented model. Hence, it provides a consistent framework for 
authoring and reuse. 

Fig. 8. Abstracted view of the proposed system. The animation is produced 
by using the operations to reuse existing models and motion. Woman model 
courtesy of Ballreich [2]. 



We plan to extend the system for use on the Internet. It may be possible 
to create a web-based database of the VRML files on the net. These files can 
then be queried and reused to create new animations. Since the goal of the 
research is to provide an easier approach to animation, we also plan to offer 
more levels of abstraction, and to make the framework flexible enough to be 
adapted to other animation formats. 
Abstract. We describe an implementation and a proof of correctness 
of binary decision diagrams (BDDs), completely formalized in Coq. This 
allows us to run BDD-based algorithms inside Coq and paves the way for 
a smooth integration of symbolic model checking in the Coq proof assis- 
tant by using reflection. It also gives us, by Coq’s extraction mechanism, 
certified BDD algorithms implemented in Caml. We also implement and 
prove correct a garbage collector for our implementation of BDDs inside 
Coq. Our experiments show that this approach works in practice, and 
is able to solve both relatively hard propositional problems and actual 
industrial hardware verification tasks. 



1 Introduction 

Binary Decision Diagrams (BDDs for short) [9] are a compact and canonical rep- 
resentation of propositional formulae up to propositional equivalence, or equiv- 
alently of Boolean functions. BDDs and related data structures are at the heart 
of modern automated verification systems, based on model-checking [24] or on 
direct evaluation of observable equivalence between finite-state machines [10]. 
These techniques have enabled solving huge verification problems automatically. 
On the other hand, proof assistants like Coq [3], Lego [23] or HOL [15] are expres- 
sive, in that they can address a vast amount of rich mathematical theory (not 
just finite-state machines or Boolean functions). The expressiveness provided by 
such systems makes them eminently suitable for diverse verification tasks, such 
as expressing modular properties of hardware circuits [22] . 

Proof assistants are also safe, in that they rest on firm logical foundations, for 
which consistency proofs are known. While implementations of proof assistants 
might of course be faulty, some design principles help limit this to an absolute 
minimum. In HOL, the main design principle, inherited from LCF [16], is that 
every deduction, whatever its size may be, must be justified in terms of a 
small number of indisputable elementary logical principles. In Coq and Lego, 
every proof-search phase is followed by an independent proof-checking phase, 
so that buggy proof-search algorithms cannot fool the system into believing 
logically incorrect arguments. 

However, proof assistants are not automatic, even when it comes to special- 
ized domains like Boolean functions. In particular, proof assistants today cannot 
perform automatic model-checking (PVS [27] is a notable exception, see below.) 
To combine the expressiveness and safety of proof assistants with model-checking 
capabilities, there have been proposals to integrate BDDs with proof assistants 
(some are discussed below), which may involve certain trade-offs. If the proof 
assistant is to remain safe, we cannot just force it to rely on the results provided 
by an external model-checker. We might instrument an external model-checker 
so that it returns an actual proof, which the given proof assistant will then be 
able to check. But model-checkers actually enumerate state spaces of astronom- 
ical sizes, and mimicking this enumeration — at least naively, see related work 
below — with proof rules inside the proof assistant would be foolish. 

One remedy to this model-checker (automatic, fast) vs proof assistant (ex- 
pressive, safe) antinomy is to use reflection [32,7,1,4,6], which in this context 
can roughly be thought of as “replacing proofs in the logic by computations” 
(see Section 3). Reflection is particularly applicable in Coq, since Coq is based 
on the calculus of inductive constructions, a logic which is essentially a typed 
lambda-calculus, a quintessential programming language. 

In this paper we describe an implementation of BDDs in Coq using reflec- 
tion. We model BDDs, implement the basic logical operations on them, and 
prove their correctness, all within Coq. This provides a formal proof of the BDD 
algorithms, and also makes available these techniques within the proof assistant. 
Our BDD implementation employs garbage collection. We describe how we im- 
plement a provably correct garbage collector for our BDD implementation. We 
provide experimental results about our BDD implementation on relatively hard 
propositional problems as well as industrial hardware verification benchmarks. 

We must clarify that the aim of this work is not to improve on current verifi- 
cation technology, as far as efficiency or the size of systems verified is concerned. 
Our goal is to produce certified model-checkers, and also reflected model-checkers 
that could work as subsystems of Coq. Given that the Common Criteria (CC) 
certification [26] requires the use of certified formal methods at evaluation levels 
5 and higher, our work may be viewed as enriching the set of tools available to 
people and industries engaged in the CC certification process to include proved 
BDD techniques, running at an acceptable speed. It has to be noted that there is 
also increasing interest in independent proof-checking of machine-checked proofs: 
here, Coq is a particularly interesting system to work with, since its core logic 
is so small that stand-alone, independent proof-checkers for it are not too hard 
to implement. In fact, such a proof-checker exists that has even been formally 
certified in Coq [2] (augmented with a strong normalization axiom to get around 
Gödel’s second incompleteness theorem). 

The plan of the paper is as follows. We review some related work in the rest 
of the introduction, then give short descriptions of BDDs in Section 2, and of 
Coq in Section 3. Sections 4 and 5 are the technical core of this paper: in Sec- 
tion 4 we describe how BDDs are not only modeled but actually programmed 
and proved in Coq. This involves a number of difficulties; to name a few: we have 
to describe memory allocation, sharing, recursion over BDDs, memoization, and 
to prove that they all work as expected. The organization of the correctness 
proofs is on classical lines: we state representational invariants, which we show 
are maintained by the BDD algorithms; we provide the semantics of BDDs in 
terms of Boolean functions, with respect to which the correctness of the algo- 
rithms is proven. For the BDD implementation to be really useful in practice, we 
also need to address problems related to memory management. We also describe 
a formally proved garbage collector for our implementation. In Section 4, we only 
describe the assumptions about the garbage collector and show that our BDD 
implementation works correctly under these assumptions. The garbage collector 
we implement is described and then shown to satisfy these assumptions in Sec- 
tion 5. The garbage collector is shown to preserve representational invariants and 
the semantics of all BDD nodes designated as relevant. In Section 6 we report 
experimental results, and discuss speed and space issues. We then conclude in 
Section 7. 

The complete Coq code and proofs can be found at 
http://www.dyade.fr/fr/actions/vip/bdd.tgz. 

Related work. 

Closely related to our work is John Harrison’s interfacing of a BDD library 
with the HOL prover [19]. The author’s goal was to solve the validity problem 
for propositional logic formulae rather than to perform model-checking. Reflec- 
tion cannot be employed in HOL, since its logical language does not contain 
a programming sub-language. To be precise, it does contain a $\lambda$-calculus with 
$\beta\eta$-equality, but this equality is not implemented as a reduction process as in 
Coq, but through the use of axioms defining it, which then have to be invoked 
explicitly. Therefore, the BDD library must log every BDD reduction step and 
translate each as a proper HOL inference, which HOL will then check. This log- 
ging process is difficult to implement, produces huge logs, and it is necessary 
to use a few tricks in the HOL implementation to keep the HOL proof-checking 
phase reasonably efficient. 

There have also been several publications on integrating model checking with 
theorem proving. Indeed, this is one of the main aims of PVS [27]; but the internal 
model-checker of PVS is itself not checked formally, for now at least. Yu and Luo 
have built a model checker LegoMC [33] which returns a proof term expressing 
why it believes the given formula to hold on the given program. Safety is achieved 
through Lego checking the latter proof-term. However such methods do not seem 
to scale up to complex problems, since they are not symbolic checking methods 
like those based on BDDs. 

Gordon and his colleagues have combined the HOL proof assistant with an 
external BDD package [13,14], allowing the HOL prover to accept results from 
the external BDD package, without any rechecking by HOL. While this approach 
is likely to work faster, its safety is dependent on the reliability of the external 
BDD package as well as the mechanisms for communication between HOL and 
the BDD package. Our approach involves more work but is completely safe. 

There have also been several works on verifying garbage collectors inside 
proof assistants. Goguen et al. [11] model memory as a directed graph, define 
basic operations that a mutator may apply to memory, then show that adding a 
garbage collector (abstracted as Dijkstra et al.’s tricolor marking collector) gives 
a bisimilar, hence behaviorally equivalent, process. Their high-level view does 
not take into account any particular algorithmic detail, whereas we implement 
actual code that is part of a BDD implementation running in Coq and show that 
our implementation works as expected in the presence of this garbage collector. 
Moreover, they verify a garbage-collector for a fairly classical memory architec- 
ture, which does not include persistent hash-tables that should be purged of 
freed elements. So not only do we verify actual code, we also prove a more com- 
plex garbage collection architecture. This last aspect is also a difference between 
our work and others such as [12,21,25,29]. 

2 Binary Decision Diagrams 

BDDs represent propositional logic formulae (built from variables and the 
connectives $\wedge$, $\vee$, $\neg$, $\Leftrightarrow$) as graphs. The 
representation is based on Shannon’s decomposition of any formula $F$ as 
$(A \Rightarrow F[A := 1]) \wedge (\neg A \Rightarrow F[A := 0])$, where 
$F[A := G]$ denotes substitution of $G$ for $A$ in $F$, 1 is true and 0 is 
false. We also use the convenient notation 
$F = A \rightarrow F[A := 1]; F[A := 0]$. By choosing a variable $A$ occurring 
in $F$, decomposing with respect to it, and recursively continuing the process 
for $F[A := 1]$ and $F[A := 0]$, we get a binary tree (a Shannon tree) whose 
internal nodes are labeled with variables and whose leaf nodes are 1 and 0. 
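
The recursive decomposition just described can be sketched in a few lines of Python. This is a minimal illustration, assuming that a formula is given as a Python function from an environment (a dictionary from variable names to truth values) to a truth value; the name shannon_tree and the tuple encoding (variable, subtree for 1, subtree for 0) are choices made here, not notation from the paper. For simplicity the sketch decomposes on every listed variable, whether or not it occurs in the formula.

```python
def shannon_tree(f, variables, env=None):
    """Return the Shannon tree of f as nested tuples (var, high, low), with leaves 0 and 1."""
    env = env or {}
    if not variables:
        return 1 if f(env) else 0           # all variables fixed: evaluate to a leaf
    a, rest = variables[0], variables[1:]
    high = shannon_tree(f, rest, {**env, a: True})    # F[A := 1]
    low = shannon_tree(f, rest, {**env, a: False})    # F[A := 0]
    return (a, high, low)

conj = shannon_tree(lambda e: e["a"] and e["b"], ["a", "b"])
```

Here conj is the complete tree for the conjunction of a and b. Note that its two subtrees for b are built independently and nothing is shared or simplified, which is what the sharing and reduction conditions discussed next address.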

BDDs are shared, reduced and ordered versions of Shannon trees. Sharing 
means that all isomorphic subtrees are stored at the same address in memory. 
This gives us directed acyclic graphs instead of trees. It is also the main reason 
why BDDs are very small in many applications. Sharing is accomplished by 
having a sharing map which remembers all the BDDs that have been previously 
constructed. Reducing means that a BDD node with identical children represents 
a redundant comparison and can be replaced by a child node. Ordering means 
that we fix some ordering on the variables and have all the variables along any 
path in a BDD occur in that order. These conditions ensure that BDDs are 
canonical forms for propositional formulae up to propositional equivalence [9]. 
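
The three conditions can be illustrated with a small Python sketch. It is an assumption-laden toy, not the Coq development described later: the class name BDD, the method names mk and build, and the use of Python dictionaries for the node store and the sharing map are all choices made for this example. Nodes are addressed by integers, with 0 and 1 reserved for the leaves.

```python
class BDD:
    def __init__(self):
        self.nodes = {0: None, 1: None}   # address -> (var, high, low); 0 and 1 are leaves
        self.unique = {}                  # sharing map: (var, high, low) -> address

    def mk(self, var, high, low):
        if high == low:                   # reducing: skip a redundant test
            return high
        key = (var, high, low)
        if key not in self.unique:        # sharing: allocate only if not built before
            addr = len(self.nodes)
            self.nodes[addr] = key
            self.unique[key] = addr
        return self.unique[key]

    def build(self, f, order, env=None):
        """Build the BDD of f, testing variables in the fixed order given (ordering)."""
        env = env or {}
        if not order:
            return 1 if f(env) else 0
        a, rest = order[0], order[1:]
        return self.mk(a,
                       self.build(f, rest, {**env, a: True}),
                       self.build(f, rest, {**env, a: False}))

bdd = BDD()
r1 = bdd.build(lambda e: e["a"] and e["b"], ["a", "b"])
r2 = bdd.build(lambda e: not (not e["a"] or not e["b"]), ["a", "b"])
taut = bdd.build(lambda e: e["a"] or not e["a"], ["a", "b"])
```

The two equivalent formulae receive the same address (r1 equals r2), and the tautology collapses to the leaf 1, which is the canonicity property in miniature. Building by exhaustive enumeration, as build does, is exponential and only serves to keep the example short.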

One reason why BDDs are interesting is that they are usually compact. 
Furthermore, they are easy to work with, at least in principle: all the usual 
logical operations are easily definable by recursive descent on BDDs. Negation, 
for example, is defined by $\neg 1 =_{df} 0$, $\neg 0 =_{df} 1$, 
$\neg(A \rightarrow F; G) =_{df} A \rightarrow \neg F; \neg G$, and any other 
Boolean operation $\oplus$ (ranging over $\vee$, $\wedge$, $\Rightarrow$, 
$\Leftrightarrow$, etc.) is defined as 
$(A \rightarrow F_1; G_1) \oplus (A \rightarrow F_2; G_2) =_{df} A \rightarrow (F_1 \oplus F_2); (G_1 \oplus G_2)$, 
with the usual definitions on 1 and 0. This is called orthogonality of the 
operation $\oplus$. 

By remembering the results of previous computations in a cache, a technique 
sometimes called memoizing or tabulating, negation can be computed in linear 
time, and binary operations in quadratic time w.r.t. the sizes of the input BDDs. 
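
As an illustration of recursive descent with memoization, here is a hedged Python sketch of negation and disjunction over a shared node store. The names (mk, neg, bdd_or, neg_memo, or_memo) and the convention that smaller variable numbers sit nearer the root are assumptions of this example. When the two operands do not test the same variable, the one that does not test the smaller variable is treated as constant in it, which is the usual way of applying the orthogonality rule to ordered BDDs.

```python
nodes = {}      # address -> (var, high, low); addresses 0 and 1 are the leaves
unique = {}     # (var, high, low) -> address
neg_memo = {}   # address -> address of its negation
or_memo = {}    # (address, address) -> address of the disjunction

def mk(var, high, low):
    if high == low:
        return high
    key = (var, high, low)
    if key not in unique:
        addr = len(nodes) + 2            # 0 and 1 are reserved for the leaves
        nodes[addr] = key
        unique[key] = addr
    return unique[key]

def neg(u):
    if u in (0, 1):
        return 1 - u
    if u not in neg_memo:                # each node is negated at most once
        var, high, low = nodes[u]
        neg_memo[u] = mk(var, neg(high), neg(low))
    return neg_memo[u]

def top_var(u):
    return nodes[u][0] if u > 1 else float("inf")

def cofactors(u, var):
    """Return (high, low) of u with respect to var; (u, u) if u does not test var."""
    if u > 1 and nodes[u][0] == var:
        return nodes[u][1], nodes[u][2]
    return u, u

def bdd_or(u, v):
    if u == 1 or v == 1:
        return 1
    if u == 0:
        return v
    if v == 0:
        return u
    if (u, v) not in or_memo:            # each pair of nodes is combined at most once
        var = min(top_var(u), top_var(v))
        uh, ul = cofactors(u, var)
        vh, vl = cofactors(v, var)
        or_memo[(u, v)] = mk(var, bdd_or(uh, vh), bdd_or(ul, vl))
    return or_memo[(u, v)]

x = mk(0, 1, 0)   # the variable numbered 0
y = mk(1, 1, 0)   # the variable numbered 1
```

The memo tables are what bound the work: neg visits each node once, and bdd_or visits each pair of nodes once.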



3 An Overview of Coq 

Coq is a proof assistant based on the Calculus of Inductive Constructions (CIC), 
a type theory that is powerful enough to formalize most of mathematics. It 




166 Kumar Neeraj Verma et al. 



properly includes higher-order intuitionistic logic, augmented with definitional 
mechanisms for inductively defined types, sets, propositions and relations. CIC 
is also a typed $\lambda$-calculus, and can therefore also be used as a programming 
language. For a gentle introduction, see [20]. We shall describe here the main 
features of Coq that we shall need later, and also what reflection means in our 
setting. 

The sorts of CIC are Prop, the sort of all propositions; Set, the sort of all 
specifications, programs and data types; and $\mathrm{Type}_i$, $i \in \mathbb{N}$, 
which we won’t need. The typing rules for sorts include 
$\mathrm{Prop} : \mathrm{Type}_0$, $\mathrm{Set} : \mathrm{Type}_0$, 
$\mathrm{Type}_i : \mathrm{Type}_{i+1}$. 

What it means for Prop to be the sort of propositions is that any object 
F : Prop (read: F of type Prop) denotes a proposition, i.e., a formula. If F and 
G are propositions, F -> G is a proposition (read: “F implies G”), and (x:T)G 
is a proposition, for any type T (read: “for every x of type T, G”). In turn, any 
object $\pi$ : F is a proof $\pi$ of F. A formula is considered proved in Coq 
whenever we have succeeded in finding a proof of it. Proofs are written, at 
least internally, as $\lambda$-terms, but we shall be content to know that we can 
produce them with the help of tactics, which allow one to write proofs by 
reducing the goal to subgoals, and eventually to immediate, basic inferences. 

Similarly, any object t : Set denotes a data type, like the data type nat 
of natural numbers, or the type nat -> nat of all functions from naturals to 
naturals. Data types can, and will usually be, defined inductively. For instance, 
the standard definition of the type nat of natural numbers in Coq reads: 

    Inductive nat : Set := 
        O : nat 
      | S : nat -> nat. 

which means that nat is of type Set, hence is a data type, that O (zero) is a 
natural number, that S is a function from naturals to naturals (successor), and 
that every natural number must be of the form (S (S ... (S O) ...)) (induction). 

Then, if t : Set, and p : t, we say that p is a program of type t. Programs 
in Coq are purely functional programs: the language of programs is based on a 
$\lambda$-calculus with variables, application (p q) of p to q (in general, 
$(p\ q_1 \ldots q_n)$ will denote the same as $(\ldots (p\ q_1) \ldots q_n)$), 
abstraction [x:t]p (where x is a variable; this denotes the function mapping x 
of type t to p), case splits (e.g., Cases n of O => v | (S m) => (f m) end 
returns either v if the natural number n is zero (O), or (f m) if n is the 
successor of some natural number m), and functions defined by structural 
recursion on their last argument. For the latter, consider the following 
definition in Coq’s standard prelude: 

    Fixpoint plus [n:nat] : nat -> nat := 
      [m:nat] Cases n of 
          O => m 
        | (S p) => (S (plus p m)) 
      end. 

This is the standard definition of addition of natural numbers, by primitive 
recursion over the first argument. For soundness reasons, Coq refuses to accept 
any non-terminating function. In fact, Coq refuses to acknowledge any Fixpoint 
definition that is not primitive recursive, even though its termination might be 
obvious. 

In general, propositions and programs can be mixed. We shall exploit 
this mixing of propositions and programs in a very limited form: when $\pi$ 
and $\pi'$ are programs (a.k.a. descriptions of data), then $\pi = \pi'$ is a 
proposition, expressing that $\pi$ and $\pi'$ have the same value. Reflection 
can then be implemented as follows: given a property P : t -> Prop, write a 
program $\pi$ : t -> bool deciding P. (Note, by the way, that bool is defined by 
Inductive bool : Set := true : bool | false : bool, and should be 
distinguished from Prop.) Then prove the correctness lemma: 

    (x:t) ($\pi$ x)=true -> (P x) 

To show that P holds of some concrete data $x_0$ (without any free Coq variable), 
you can then compute ($\pi$ $x_0$) in Coq’s programming language. If the result is 
true, then ($\pi$ $x_0$)=true holds, and the correctness lemma then allows you to 
conclude that (P $x_0$) holds in Coq’s logic. 

More concretely, assume we are interested in the validity of Boolean expres- 
sions. The type of Boolean expressions is: 



    Inductive bool_expr : Set := 
        Zero : bool_expr                          (* false *) 
      | One  : bool_expr                          (* true *) 
      | Var  : BDDvar->bool_expr                  (* propositional variables *) 
      | Neg  : bool_expr->bool_expr               (* negation *) 
      | Or   : bool_expr->bool_expr->bool_expr    (* disjunction *) 
      | And  : bool_expr->bool_expr->bool_expr    (* conjunction *) 
      | Impl : bool_expr->bool_expr->bool_expr    (* implication *) 
      | Iff  : bool_expr->bool_expr->bool_expr.   (* logical equivalence *) 



(The type BDDvar of variables will be described in Section 4.1.) Given an 
environment $\rho$ mapping variables to truth values (in bool), it is easy to 
define the truth value of a given Boolean expression $be$ under $\rho$, by 
standard evaluation rules: Zero is always false, (Var x) is true if and only if 
$\rho(x)$ is, (And $be_1$ $be_2$) is true provided both $be_1$ and $be_2$ are, 
and so on. A Boolean expression is valid if and only if its value is true under 
every environment $\rho$. 
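
The evaluation rules and the notion of validity can be made concrete with a short Python sketch. It assumes a hypothetical encoding of Boolean expressions as nested tuples whose first component names the constructor of bool_expr, and integer variables; the function names evaluate, variables and is_valid are introduced here for illustration only.

```python
from itertools import product

def evaluate(be, env):
    """Truth value of the expression be under the environment env (a dict var -> bool)."""
    tag = be[0]
    if tag == "Zero":
        return False
    if tag == "One":
        return True
    if tag == "Var":
        return env[be[1]]
    if tag == "Neg":
        return not evaluate(be[1], env)
    a, b = evaluate(be[1], env), evaluate(be[2], env)
    if tag == "Or":
        return a or b
    if tag == "And":
        return a and b
    if tag == "Impl":
        return (not a) or b
    if tag == "Iff":
        return a == b
    raise ValueError(f"unknown constructor {tag}")

def variables(be):
    """Set of variables occurring in be."""
    if be[0] == "Var":
        return {be[1]}
    return set().union(*(variables(sub) for sub in be[1:]))

def is_valid(be):
    """Check validity by trying every environment over the variables of be."""
    vs = sorted(variables(be))
    return all(evaluate(be, dict(zip(vs, values)))
               for values in product([False, True], repeat=len(vs)))

p, q = ("Var", 0), ("Var", 1)
peirce = ("Impl", ("Impl", ("Impl", p, q), p), p)
```

Here is_valid(peirce) holds while is_valid(p) does not. The check enumerates all assignments to the variables, so its cost doubles with each additional variable.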

It is often the case that we are interested in showing that a given Boolean 
expression $be$ is valid. The standard way to prove this is to list its free 
variables, say $x_1, \ldots, x_n$, then do a case analysis on the truth value of 
$x_1$; in each of the two cases ($x_1$ true, or $x_1$ false), do a case analysis 
on $x_2$, and so on. In the worst case, this requires writing a proof of size 
$O(2^n)$, which is tedious or even infeasible for $n$ large enough. Furthermore, 
even though writing this proof may in principle be automated, by writing a 
computer program whose role is to output all the necessary proof steps to be 
fed to the Coq proof engine, the sheer size of it will make its verification 
consume too much time and space to be feasible at all, for $n > 30$ at least on 
current machines. 

Bypassing this problem can be accomplished by devising proof rules, and 
showing that they are sound with respect to the semantics of Boolean 
expressions. We then let the user, or some computer program implementing a 
decision procedure for propositional logic, generate the proper sequence of 
applications of these proof rules to establish that $be$ is valid (which is 
possible provided these proof rules are complete), and the proof assistant will 
then re-check that this sequence of applications is correct. This is what 
Boutin [6] calls partial reflection. This is also what Harrison [19] 
implemented in HOL, using BDDs as decision procedure; we have already seen in 
Section 1 why this solution was not completely satisfactory. 

Our solution is to use total reflection [6]: we write a function is_tauto in 
Coq’s $\lambda$-calculus that takes a Boolean expression $be$ as input and 
returns whether the BDD for $be$ is the distinguished 1 or not. We then prove 
the fundamental correctness lemma is_tauto_lemma, which states that if 
(is_tauto $be$) equals true, then $be$ is valid. (The actual definitions are 
slightly more elaborate, see Section 4 for the precise formulation, 
implementation and proof strategies.) The task of any user wishing to show that 
a given Boolean expression $be$ is valid will then be as follows: submit this 
claim to Coq, which will ask for a proof; then type Apply is_tauto_lemma: Coq 
now asks for a proof of (is_tauto $be$)=true. Finally, type Reflexivity to 
instruct Coq that both sides of the equals sign in fact have the same value. 
Since they are not syntactically the same, Coq will start computing, and 
simplify both until either they are equal (in which case the proof succeeds), 
or until they are not and no simplification is possible (then the proof fails). 

At this point, the proof is finished. It has been completely automated, it is 
small (the size of the generated proof-term is only that of the goal we proved 
plus some constant overhead, since computation steps are not recorded), and it 
is as safe as the rest of Coq (we never had to trust an external BDD package 
blindly, in particular). 

4 Implementing BDDs in Coq 

In this section we describe how we model BDDs in Coq and implement and 
prove the basic algorithms on them. We first describe the data structures used 
for modeling BDDs, for implementing sharing and memoization, and for garbage 
collection. We define the invariants of our representation, define the semantics of 
BDDs in terms of Boolean functions and prove the correctness of the algorithms 
in terms of this semantics. In this process we also prove the uniqueness of BDDs 
for our representation. The garbage collector is described in Section 5. Here we 
describe the main BDD algorithms, with the necessary assumptions about the 
garbage collector. Since the Coq formalization assures technical correctness of 
our results, our endeavor will be to convey the underlying intuition, with only 
a few actual Coq definitions shown. The formal definitions and details may be 
found in the complete Coq code and proofs. 

4.1 Representation 

We use a library of maps [18] to model a state, i.e. the memory in which all 
the BDDs are to be stored, and also the sharing maps that support sharing 
of BDDs. This library contains an implementation of finite sets and maps. 
The type (Map A) consists of maps denoting finite functions from addresses 
(of type ad, which consists of binary representations of integers) to the set A. 
(MapGet ? m a) returns (NONE A) if the address a is not in the domain of the 
map m, or (SOME A x) if m maps a to the element x (of type A). (Note the 
use of the question mark to abbreviate a type argument, here A, that the Coq 
type-checker is able to infer by itself. In the above case, (MapGet ? m a) is 
understood by Coq as (MapGet A m a).) (NONE A) and (SOME A x) are the 
elements of type (option A). (MapPut ? m a x) changes the map m by mapping 
the address a to x in m. Equality of addresses is defined by ad_eq. Addresses can 
be converted to natural numbers (of type nat) by nat_of_ad. 

Definition BDDstate := (Map (BDDvar * ad * ad)). 

Definition BDDsharing_map := (Map (Map (Map ad))). 

A BDDstate maps addresses to triples consisting of a variable and two other 
addresses which represent the left and right sub-BDDs. Variables and addresses 
are both represented by the type ad (i.e. BDDvar = ad). The constants BDDzero 
and BDDone are the addresses zero and one and are used for the leaf nodes 0 and 
1 respectively. Having variables as integers gives us a natural ordering on the 
variables. It also makes it easy to define the sharing map by giving it the type 
(Map (Map (Map ad))) (the domain of maps can only be addresses). A sharing 
map represents the inverse function of a BDD state and (in curried form) maps 
left hand addresses to maps from right hand addresses to maps from variables 
to addresses. We also have maps for memoization. The memoization map for 
negation (of type BDDneg_memo, defined as (Map ad)) maps addresses to other 
addresses representing the negated BDD. Memoization maps for disjunction have 
the type BDDor_memo, defined as (Map (Map ad)). They map pairs of addresses 
to addresses. 
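
The curried shape of the sharing map can be pictured with nested Python dictionaries. The following sketch is only an analogy for the Coq maps: the function names sharing_lookup, sharing_put and is_inverse are hypothetical, and a BDD state is modeled as a dictionary from addresses to triples (variable, left, right).

```python
def sharing_lookup(share, var, left, right):
    """Return the address of the node (var, left, right), or None if there is none."""
    return share.get(left, {}).get(right, {}).get(var)

def sharing_put(share, var, left, right, addr):
    """Record that the node (var, left, right) is stored at addr."""
    share.setdefault(left, {}).setdefault(right, {})[var] = addr

def is_inverse(state, share):
    """Check that the state and the sharing map represent inverse functions."""
    forward = all(sharing_lookup(share, x, l, r) == a
                  for a, (x, l, r) in state.items())
    backward = all(state.get(a) == (x, l, r)
                   for l, by_right in share.items()
                   for r, by_var in by_right.items()
                   for x, a in by_var.items())
    return forward and backward

state = {2: (0, 0, 1)}          # address 2 holds variable 0 with children 0 and 1
share = {}
sharing_put(share, 0, 0, 1, 2)
```

Looking up a triple before allocating it is what guarantees that two isomorphic nodes never receive different addresses.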

To implement allocation of new nodes and to do garbage collection, we also 
keep some other information in our configuration. We have a free list (of type 
BDDfree_list, defined as (list ad)) which contains addresses that are freed 
during garbage collection. We also have a counter (of type ad), that is, an 
address beyond which all addresses are unused. The type BDDconfig consists of 
tuples with the six components described above. All the algorithms also take as 
input a used list (of type (list ad)) which contains addresses representing 
BDDs which are used by the user, and is used to inform the garbage collector 
that these BDDs should not be freed. 
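
One plausible way to use the free list and the counter when allocating a node is sketched below in Python. The allocation policy shown (reuse a freed address if there is one, otherwise take the counter and increment it) is an assumption made for illustration; the text above only fixes what the two components contain. The dictionary keys and the function name alloc are likewise hypothetical.

```python
def alloc(config, var, left, right):
    """Store the node (var, left, right) at an unused address and return that address."""
    if config["free_list"]:
        addr = config["free_list"].pop()   # reuse an address freed by the collector
    else:
        addr = config["counter"]           # first address known to be unused
        config["counter"] += 1
    config["state"][addr] = (var, left, right)
    return addr

config = {"state": {}, "free_list": [], "counter": 2}
first = alloc(config, 0, 0, 1)             # taken from the counter

# Simulate the collector freeing that node, then allocate again.
del config["state"][first]
config["free_list"].append(first)
second = alloc(config, 1, 0, 1)            # taken from the free list
```

In this run both allocations return address 2, and the counter advances only once.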



4.2 Invariants 

Next we define the invariants of our representation. They ensure, first, that 
the stored nodes represent well-formed BDDs (i.e. directed acyclic graphs with 
leaf nodes 1 and 0, satisfying the reducedness and ordering conditions) and, 
second, that the various components of the data structures are consistent with 
each other. 
Our first invariant, (BDDbounded bs node n), states that all the nodes in 
the BDD rooted at node have variables less than n. It also states that 
the BDDs are reduced and ordered: 

    Inductive BDDbounded [bs:BDDstate] : ad -> BDDvar -> Prop := 
        BDDbounded_0 : (n:BDDvar) (BDDbounded bs BDDzero n) 
      | BDDbounded_1 : (n:BDDvar) (BDDbounded bs BDDone n) 
      | BDDbounded_2 : (node:ad) (n:BDDvar) (x:BDDvar) (l,r:ad) 
            (MapGet ? bs node) = (SOME ? (x, (l, r))) 
         -> (BDDcompare x n)=INFERIEUR -> ~l=r 
         -> (BDDbounded bs l x) -> (BDDbounded bs r x) 
         -> (BDDbounded bs node n). 

We shall constantly require our BDD nodes node to be bounded by n (in a 
given state bs), in the sense that they satisfy (BDDbounded bs node n). 
Formally, the definition above states that the set $B_n$ of BDD nodes bounded 
by n is the smallest set containing 0 and 1 (clauses BDDbounded_0 and 
BDDbounded_1), and such that if node points to (x, (l, r)) in bs, then node is 
in $B_n$ provided $x < n$ and l and r are in $B_x$ (ordering condition: this 
implies that all variables are less than n and in decreasing order along each 
path from the root), and provided $l \neq r$ (reducedness). The BDDbounded 
predicate makes no statement about canonicity of BDDs yet: that BDDs are 
canonical will follow not only from all BDDs being bounded, but also from other 
invariants involving the well-formedness of the state bs as well as of the 
sharing maps (see Lemma BDDunique_1 in Section 4.3). 
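
Read as a recursive check, the three clauses of BDDbounded look as follows in Python. This is a sketch under assumptions: the state is a dictionary from integer addresses to triples (x, l, r), the leaves are the addresses 0 and 1, and the function name bdd_bounded is chosen here. It is a decision procedure that mirrors the inductive definition, not a translation of the Coq predicate.

```python
def bdd_bounded(bs, node, n):
    """Return True if node is bounded by n in the state bs."""
    if node in (0, 1):                 # clauses BDDbounded_0 and BDDbounded_1
        return True
    if node not in bs:                 # dangling address: no clause applies
        return False
    x, l, r = bs[node]                 # clause BDDbounded_2
    return (x < n                      # ordering: variable below the bound
            and l != r                 # reducedness
            and bdd_bounded(bs, l, x)  # children bounded by x itself
            and bdd_bounded(bs, r, x))

good = {2: (0, 0, 1), 3: (1, 2, 0)}    # variable 1 at the root, variable 0 below it
cyclic = {2: (5, 2, 1)}                # node 2 points back to itself
```

The recursion always terminates, even on a state with a pointer cycle such as cyclic, because the bound strictly decreases at each call and the test x < n fails before the cycle can be followed twice.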

Observe that we have chosen to have the variables in decreasing order on 
all paths. This is contrary to usual convention. However it allows us to use 
the fact that < is well-founded on ad, so the conditions defined in BDDbounded 
additionally imply that any bounded BDD node is the root of a finite graph, 
which is therefore in particular also acyclic. Without BDDbounded, enforcing 
acyclicity would have been much trickier. 

This choice of ordering also gives a natural induction principle for doing all 
the proofs related to BDDs. In most paper-proofs of BDD algorithms, induction 
is done either on the structure of BDDs or on their sizes. But the structure of 
BDDs is not apparent to Coq — at least BDDs are not just inductive trees, and 
instead appear to Coq as a mesh of pointers — and size must itself be defined by 
induction over BDDs — so a more basic induction principle is needed to define 
size. Enforcing BDD nodes to be bounded by some number provides us with 
an easy way out: we just induct on a canonical number that is strictly larger 
than any variable in the given BDD. More precisely, the value that we use for 
induction in most of the proofs is bs_var' (see below), which is one more than 
the variable stored at the root node of the BDD. In case of leaf nodes, it is zero, 
thus ensuring that bs_var' strictly decreases from a node to its children, even 
if the children are leaf nodes: 

    Definition bs_var' := [bs:BDDstate; node:ad] 
      Cases (MapGet ? bs node) of 
          NONE => ad_z   (* leaf node; return ad_z (representing zero) *) 
        | (SOME (x,(l,r))) => (ad_S x)   (* ad_S is the successor function *) 
      end. 



As far as the other needed invariants are concerned, let us mention the under- 
lying ideas, rather than display the (fairly straightforward) Coq definitions. The 




Reflecting BDDs in Coq 171 



predicate BDDstate_OK says that there is nothing stored in the BDD state at 
the addresses zero and one (since they are reserved for leaf nodes), and that 
any node containing the variable n is bounded by n + 1 (i.e. satisfies the 
BDDbounded predicate). Sharing is ensured by the predicate BDDsharing_OK, which 
says that the sharing map and the BDD state represent inverse functions. The 
predicate BDDfree_list_OK says that the free list stores exactly those nodes 
strictly between 1 and counter which are not in the domain of the BDD state; 
we also require that it does not contain duplicates, as this is needed for 
correct behavior. The predicate counter_OK means that the counter is strictly 
greater than one and that there is nothing stored in the BDD state starting 
from this address. The predicates BDDneg_memo_OK and BDDor_memo_OK define the 
invariants on the memoization maps. A memoization map is OK if all entries 
refer to nodes that are OK (a node is OK if it is in the domain of the BDD 
state or is a leaf node) and the interpretations of the nodes as Boolean 
functions (see Section 4.3) are related as expected. 

We have the predicate BDDconfig_OK for BDD configurations which encap- 
sulates the six invariants described above. 

Besides that, we have the predicate used_list_OK for the used list that is 
passed to all the functions as argument, which says that the used list should 
only contain nodes which are OK. 

4.3 Interpretation as Boolean Functions 

A Boolean function (of type bool_fun) maps environments to Booleans, and an
environment maps variables to Booleans. bool_fun_zero and bool_fun_one are
the constant Boolean functions. Extensional equality is defined by bool_fun_eq.
We also have the usual operations bool_fun_and, bool_fun_or, bool_fun_neg,
bool_fun_impl, bool_fun_iff, and bool_fun_if, which are easy to define. A node
node in the BDD state bs is interpreted as a Boolean function in the expected
way by the following function, which we discuss below.

Fixpoint bool_fun_of_BDD_1 [bs:BDDstate; node:ad; bound:nat] : bool_fun :=
  Cases bound of
      O => (* Error *) bool_fun_zero
    | (S bound’) =>
      Cases (MapGet ? bs node) of
          NONE => (* leaf node *)
            if (ad_eq node BDDzero) then bool_fun_zero else bool_fun_one
        | (SOME (x,(l,r))) =>
            (bool_fun_if x (bool_fun_of_BDD_1 bs r bound’)
                           (bool_fun_of_BDD_1 bs l bound’))
      end
  end.



We come up against a difficulty while defining this function, namely that
Coq only allows recursive functions which use structural induction on the last
argument (this is a design principle of Coq ensuring that Coq is strongly
normalizing and consistent). We would indeed like to define bool_fun_of_BDD_1 as
a function of just bs and node, by induction on the structure of the BDD stored
at node. However, we have no direct way of doing so, since BDDs are not
represented as trees (this would have prevented us from dealing with sharing).
This problem is solved by supplying an extra argument bound of type nat. This
argument is decreased by one in the recursive calls to the function, which makes
Coq happy. (This is sometimes known as “Boyer’s trick.”) We then supply a
bound which is greater than the maximum depth of recursion we expect. This
bound can be very easily computed from the variable that is stored at the root
node (to be precise, it is one more than the quantity bs_var’). The semantics
function bool_fun_of_BDD_bs (omitted here) is defined accordingly, using the
above function. We also prove a theorem stating that the value returned by this
function is independent of the bound passed as argument, provided that this
bound is sufficiently large. We employ a similar tack in all the other functions
(negation, disjunction, etc.) where we need to recurse on the structure of the
BDDs.
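
The bounded recursion described above can be illustrated outside Coq. The following Python sketch is not the Coq development: the dictionary-based state and the names ZERO, ONE, bs_var_succ, eval_bounded and bool_fun_of_bdd are assumptions made purely for illustration. It evaluates a node under an environment with an explicit bound, mirroring the shape of bool_fun_of_BDD_1.

```python
# Illustrative sketch only: a BDD state as a dict from addresses to
# (variable, low child, high child). Addresses 0 and 1 are the leaves.
ZERO, ONE = 0, 1

def bs_var_succ(bs, node):
    """One more than the variable at `node`, or 0 for a leaf (cf. bs_var')."""
    entry = bs.get(node)
    return 0 if entry is None else entry[0] + 1

def eval_bounded(bs, node, env, bound):
    """Evaluate `node` under `env`, with recursion depth limited by `bound`."""
    if bound == 0:
        return False              # "error" case: the bound was too small
    entry = bs.get(node)
    if entry is None:             # leaf node
        return node != ZERO
    var, low, high = entry
    child = high if env[var] else low
    return eval_bounded(bs, child, env, bound - 1)

def bool_fun_of_bdd(bs, node, env):
    """Supply a bound that is large enough: one more than bs_var'."""
    return eval_bounded(bs, node, env, bs_var_succ(bs, node) + 1)

# x1 AND x0; variables strictly decrease from the root to the leaves.
bs = {2: (0, ZERO, ONE),          # node 2: if x0 then 1 else 0
      3: (1, ZERO, 2)}            # node 3: if x1 then node 2 else 0
```

With this state, any bound of at least 3 gives the same value for node 3, whereas a bound of 2 runs out before reaching a leaf and falls into the error case; this is the situation that the independence theorem mentioned above rules out for sufficiently large bounds.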

We prove the uniqueness of BDDs for our representation in terms of the 
addresses where they are stored. The uniqueness lemma says that if the Boolean 
functions represented by two nodes in a BDD state are equal then the two nodes 
are equal (i.e. they are at the same address). In the statement of the lemma we 
have the extra argument n which is equal to the maximum of the bs_var’s of 
the two nodes. This then enables us to apply well founded induction on n. 

Lemma BDDunique_l : (bs:BDDstate; share:BDDsharing_map) (BDDstate_OK bs)
  -> (BDDsharing_OK bs share) -> (n:nat) (node1,node2:ad)
     n=(max (nat_of_ad (bs_var’ bs node1)) (nat_of_ad (bs_var’ bs node2)))
  -> (node_OK bs node1) -> (node_OK bs node2)
  -> (bool_fun_eq (bool_fun_of_BDD_bs bs node1) (bool_fun_of_BDD_bs bs node2))
  -> node1=node2.

The proof takes around 500 steps, and is along expected lines. 

Outline of proof: The case where both node1 and node2 are leaf nodes is
easy. If node1 is a leaf node, then node2 cannot be a non-leaf node, because
then both its children would have the same semantics and consequently would
be equal (by the induction hypothesis), violating the fact that BDDs are reduced
(one of the conditions in the BDDbounded predicate). In case both node1 and
node2 are non-leaf nodes, we first show that they contain the same variable,
which requires us to separately prove a lemma stating that the Boolean function
represented by a node in a configuration is independent of the variables greater
than the variable at the root node. We then use the induction hypothesis to
derive that the left children are equal and the right children are equal. Finally,
we apply the conditions in the BDDsharing_OK predicate to show that the two
nodes are equal. □



4.4 Assumptions about the Garbage Collector 

As mentioned earlier, our implementation employs garbage collection. We
factor the verification into two parts: proving that the BDD algorithms are correct
assuming the garbage collector satisfies certain specifications, and then showing
(in Section 5) that the garbage collector we implement satisfies these
specifications. This factoring also gives us the flexibility to change the garbage
collector (for example, the conditions under which memory is freed) at some
later time, independently of the BDD algorithms.

The specification of the garbage collection function is as follows: it takes as
input a BDD configuration and a used list and returns a new configuration. Its
behavior is specified using the predicate gc_OK, which says that the new
configuration returned by the garbage collector is OK, the contents of all nodes
reachable from the used list are preserved, and no new nodes are added into the
configuration.

4.5 Memory Allocation 

The main function responsible for changing the store is BDDmake. All the BDD
algorithms modify the store by calling this function. It takes as input a variable
x and two BDD nodes l and r and returns a node representing the Boolean
function “if x then bfr else bfl”, where bfr and bfl are the Boolean functions
represented by r and l respectively. It first checks whether such a node is already
present in the configuration, and if not, it calls the function to allocate a
new node. The allocation function, as well as all the other BDD algorithms, is
parameterized by a garbage collection function gc. The allocator first calls gc
(which may or may not choose to free memory depending on the requirements),
then allocates a node from the free list, if available; otherwise it allocates the
node at the address pointed to by the counter, and the counter is incremented
by one.
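
The allocation policy just described can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Coq code: the configuration is a mutable dictionary with hypothetical field names (bs, share, free, counter), the collector gc is any function from a configuration and a used list to a configuration, and the explicit check that the two children differ is an assumption added here to keep nodes reduced.

```python
# Illustrative sketch only; the field names are hypothetical.
ZERO, ONE = 0, 1

def init_config():
    return {"bs": {},        # BDD state: address -> (var, low, high)
            "share": {},     # sharing map: (var, low, high) -> address
            "free": [],      # free list of reusable addresses
            "counter": 2}    # next fresh address (0 and 1 are leaves)

def bdd_make(gc, cfg, var, low, high, used):
    """Return (cfg, node), where node means 'if var then high else low'."""
    if low == high:                       # assumption: keep the BDD reduced
        return cfg, low
    key = (var, low, high)
    if key in cfg["share"]:               # node already present: share it
        return cfg, cfg["share"][key]
    cfg = gc(cfg, used)                   # gc may or may not free memory
    if cfg["free"]:
        node = cfg["free"].pop()          # reuse a freed address
    else:
        node = cfg["counter"]             # otherwise take a fresh address
        cfg["counter"] += 1
    cfg["bs"][node] = key
    cfg["share"][key] = node
    return cfg, node

def no_gc(cfg, used):
    """A trivial collector that never frees anything."""
    return cfg
```

Calling bdd_make twice with the same arguments returns the same address, which is the sharing property that the uniqueness lemma of Section 4.3 relies on.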



4.6 BDD Algorithms 

We implement negation and disjunction as the basic operations on which the rest 
of the operations (conjunction, implication and double implication) are based. 
Thus we have two memoization tables in the configuration. The algorithms work 
by simple recursion on the structure of the BDDs. However the extra complexity 
is because of the fact that these algorithms are not purely functional in nature 
and need to modify the store. Each recursive call to the function returns a new 
configuration. Finally we call BDDmake which returns another configuration. 

We use the following function for negation. As described in Section 4.3,
recursion on the structure of BDDs is accomplished by having an extra argument
bound of type nat to bound the maximum depth of recursion. gc is the garbage
collection function, ul is the used list, and node is the node whose negation is to
be computed. The function returns a new configuration and a new node in it
representing the negated BDD. The computed result is memoized by the function
BDDneg_memo_put.

Fixpoint BDDneg_l [gc:(BDDconfig->(list ad)->BDDconfig); cfg:BDDconfig;
                   ul:(list ad); node:ad; bound:nat] : BDDconfig*ad :=
  Cases bound of
      O => (* Error *) (initBDDconfig,BDDzero)
    | (S bound’) =>
      Cases (MapGet ? (negm_of_cfg cfg) node) of (* lookup memoization map *)
          (SOME node’) => (cfg,node’) (* return the node found *)
        | NONE =>
          Cases (MapGet ? (bs_of_cfg cfg) node) of
              NONE => (* leaf node *)
                (if (ad_eq node BDDzero)
                 then ((BDDneg_memo_put cfg BDDzero BDDone),BDDone)
                 else ((BDDneg_memo_put cfg BDDone BDDzero),BDDzero))
            | (SOME (x,(l,r))) =>
              (* internal node: recursively compute negations of child nodes *)
              Cases (BDDneg_l cfg ul l bound’) of (cfgl,nodel) =>
              Cases (BDDneg_l cfgl (cons nodel ul) r bound’) of (cfgr,noder) =>
              Cases (BDDmake gc cfgr x nodel noder (cons noder (cons nodel ul))) of
                (cfg’,node’) => ((BDDneg_memo_put cfg’ node node’),node’)
              end end end end end end.
Negation, like the other algorithms, is computed by state threading.
Recursive calls to the function, as well as calls to other functions, return a new
state. So the state needs to be passed around through this sequence of calls, and
the new state returned by each call is used in the next call.
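
State threading can be made concrete with a small purely functional sketch. The Python below is an illustration under stated assumptions, not a transcription of BDDneg_l: configurations are immutable-by-convention dictionaries with hypothetical fields bs, share, negm and counter, the helper make is a minimal stand-in for BDDmake without garbage collection or free list, and an exhausted bound raises an exception instead of returning a dummy value.

```python
ZERO, ONE = 0, 1

def make(cfg, var, low, high):
    """Minimal stand-in for BDDmake (no garbage collection, no free list)."""
    if low == high:
        return cfg, low
    key = (var, low, high)
    if key not in cfg["share"]:
        node = cfg["counter"]
        cfg = {**cfg,
               "bs": {**cfg["bs"], node: key},
               "share": {**cfg["share"], key: node},
               "counter": node + 1}
    return cfg, cfg["share"][key]

def memo_put(cfg, node, result):
    """Return a configuration whose negation memo maps node to result."""
    return {**cfg, "negm": {**cfg["negm"], node: result}}

def neg(cfg, node, bound):
    """Return (new configuration, node representing the negation)."""
    if bound == 0:
        raise RuntimeError("bound too small")
    if node in cfg["negm"]:                        # memoized result
        return cfg, cfg["negm"][node]
    entry = cfg["bs"].get(node)
    if entry is None:                              # leaf node
        result = ONE if node == ZERO else ZERO
        return memo_put(cfg, node, result), result
    var, low, high = entry
    cfg_l, neg_low = neg(cfg, low, bound - 1)      # thread the state ...
    cfg_r, neg_high = neg(cfg_l, high, bound - 1)  # ... through each call
    cfg_m, result = make(cfg_r, var, neg_low, neg_high)
    return memo_put(cfg_m, node, result), result
```

Each call consumes the configuration produced by the previous one and never modifies its input, so the old configuration remains available; this is what makes statements such as "nodes of the old configuration are preserved in the new one" expressible.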



4.7 Proofs of the Algorithms 

We prove the expected semantics of BDDmake and the BDD algorithms in terms of 
Boolean functions. In addition, there are some other common things that need to 
be proved about all of them. Since all these functions change the store, we need 
to prove that the new configurations returned by them are OK. We also need to 
show that the new nodes returned are OK and the nodes in the old configuration 
that were reachable from the input list of nodes remain undisturbed in the new 
configuration (their semantics as Boolean functions remains unchanged). The 
assumptions are that the input configurations and nodes are OK. 

The proofs are straightforward but long. This is especially so in the case of
negation and disjunction, where we prove all these invariants together since all
of them are required together for the induction to go through. As an
example, we give the correctness lemma for negation, BDDneg_l_lemma, below. It
says that for any garbage collector gc satisfying gc_OK, and for any bound
bound, configuration cfg, used list ul and node node, if bound is greater
than (bs_var’ node cfg), cfg is OK, ul is OK with respect to cfg and
node is reachable from ul, then the new configuration new_cfg returned by
(BDDneg_l gc cfg ul node bound) is OK, the new node new_node is OK in
new_cfg, all nodes reachable from ul in cfg are preserved in new_cfg, the
variable at new_node in new_cfg is the same as that at node in cfg, and the
semantics of new_node in new_cfg is the negation of the semantics of node in cfg.



Lemma BDDneg_l_lemma : (gc:(BDDconfig->(list ad)->BDDconfig)) (gc_OK gc)
  -> (bound:nat; cfg:BDDconfig; ul:(list ad); node:ad)
     (lt (nat_of_ad (var’ cfg node)) bound) -> (BDDconfig_OK cfg)
  -> (used_list_OK cfg ul) -> (used_node’ cfg ul node) (* "node" is
     reachable from a node in "ul" or is a leaf node *)
  -> (BDDconfig_OK (Fst (BDDneg_l gc cfg ul node bound)))
  /\ (config_node_OK (Fst (BDDneg_l gc cfg ul node bound))
                     (Snd (BDDneg_l gc cfg ul node bound)))
  /\ (used_nodes_preserved cfg (Fst (BDDneg_l gc cfg ul node bound)) ul)
     (* all used nodes from old configuration are preserved in the new one *)
  /\ (ad_eq (var’ (Fst (BDDneg_l gc cfg ul node bound))
                  (Snd (BDDneg_l gc cfg ul node bound)))
            (var’ cfg node))=true (* variables at input and output nodes are same *)
  /\ (bool_fun_eq (bool_fun_of_BDD (Fst (BDDneg_l gc cfg ul node bound))
                                   (Snd (BDDneg_l gc cfg ul node bound)))
                  (bool_fun_neg (bool_fun_of_BDD gc cfg node))).



This theorem is proved in around a thousand steps. The proof for disjunction
is around three times larger. We prove the above lemma using induction on
bound. It differs from normal paper proofs of the same algorithms in that we
don’t just have to reason about the BDDs on which the operations are applied,
but about other parts of the store as well. We need to show that all the invariants
are preserved in all the states and that other parts of the configuration remain
unchanged. Also, we require the condition that the bound bound, which bounds
the maximum depth of recursion, is sufficiently high.

The main reason why these proofs are long is that we need to worry more 
about the store and less about the actual BDD on which we are computing. We 
require complex invariants to state that the configuration remains OK and that 
the store remains mostly unchanged by the operations. It might be interesting 
in the future to develop some general proof techniques which allow forgetting 
about the store and worrying only about the functional aspects of the programs. 

The proofs of the BDD algorithms finally allow us to implement a tautology
checker is_tauto for Boolean expressions, parameterized by the garbage
collector. It works by building a BDD for the given Boolean expression, using the
various BDD algorithms, and then checking that the resulting node is the leaf
node BDDone. We prove the following correctness lemma is_tauto_lemma for
the tautology checker, which says that is_tauto returns true exactly when the
Boolean expression represents the constantly true Boolean function. Boolean
expressions are defined using the inductive type bool_expr and are interpreted as
Boolean functions by the function bool_fun_of_bool_expr in the obvious way.
As usual, we have the assumption that the garbage collector passed as argument
satisfies the predicate gc_OK.

Lemma is_tauto_lemma : (gc:(BDDconfig->(list ad)->BDDconfig)) (gc_OK gc)
  -> (be:bool_expr)
     (is_tauto gc be)=true <-> (bool_fun_eq bool_fun_one (bool_fun_of_bool_expr be)).
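
To make the shape of such a checker concrete, here is a small Python sketch. It is an illustration under stated assumptions rather than the verified is_tauto: the Store class, the tuple encoding of expressions and the operator names are hypothetical, garbage collection and recursion bounds are omitted, and the store is mutated in place instead of being threaded. As in Section 4.6, negation and disjunction are the primitive operations and the other connectives are derived from them.

```python
ZERO, ONE = 0, 1

class Store:
    def __init__(self):
        self.bs, self.share, self.counter = {}, {}, 2
        self.negm, self.orm = {}, {}          # the two memoization tables

    def make(self, var, low, high):
        if low == high:
            return low
        key = (var, low, high)
        if key not in self.share:
            self.bs[self.counter] = key
            self.share[key] = self.counter
            self.counter += 1
        return self.share[key]

    def var_of(self, node):
        return self.bs[node][0] if node in self.bs else -1

    def neg(self, node):
        if node in (ZERO, ONE):
            return ONE - node
        if node not in self.negm:
            var, low, high = self.bs[node]
            self.negm[node] = self.make(var, self.neg(low), self.neg(high))
        return self.negm[node]

    def disj(self, a, b):
        if a == ONE or b == ONE:
            return ONE
        if a == ZERO:
            return b
        if b == ZERO:
            return a
        if (a, b) not in self.orm:
            # The larger variable sits nearer the root, so split on it.
            var = max(self.var_of(a), self.var_of(b))
            a0, a1 = self.bs[a][1:] if self.var_of(a) == var else (a, a)
            b0, b1 = self.bs[b][1:] if self.var_of(b) == var else (b, b)
            self.orm[(a, b)] = self.make(var, self.disj(a0, b0),
                                         self.disj(a1, b1))
        return self.orm[(a, b)]

def build(store, expr):
    """Build the BDD node for a Boolean expression given as nested tuples."""
    op = expr[0]
    if op == "var":
        return store.make(expr[1], ZERO, ONE)
    if op == "neg":
        return store.neg(build(store, expr[1]))
    if op == "or":
        return store.disj(build(store, expr[1]), build(store, expr[2]))
    if op == "and":    # derived: a and b = not (not a or not b)
        return build(store, ("neg", ("or", ("neg", expr[1]), ("neg", expr[2]))))
    if op == "impl":   # derived: a -> b = not a or b
        return build(store, ("or", ("neg", expr[1]), expr[2]))
    raise ValueError(op)

def is_tauto(expr):
    """True exactly when the BDD built for expr is the leaf ONE."""
    return build(Store(), expr) == ONE
```

Because nodes are shared and reduced, comparing the resulting address with ONE is enough; no traversal of the BDD is needed at the end.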

A final note on implementing BDDs in Coq: it might seem preferable to use 
Typed Decision Graphs [5], a.k.a. complement edges [8], instead of plain BDDs. 
Their main value over plain BDDs is that negation operates in constant time. 
Here, it would allow us to dispense with the memoization table for negation, 
which would seem to simplify our proof work. However, disjunction is more 
complex to define and to reason about using complement edges. Moreover, the 
memoization map for disjunction, which mapped pairs of BDD nodes to BDD 
nodes in the plain BDD case, would map triples consisting of two BDD nodes 
plus a Boolean sign to pairs of BDD nodes and a Boolean sign: the latter would 
have to be defined under Coq as two memoization tables, one for the plus sign, 
one for the minus sign. This is clearly no advantage over having one memoization 
table for disjunction and one for negation. Still, the fact that negation is faster 
with complement edges means that it might be a good idea to implement and 
prove them, as well as any more elaborate BDD representation in the future. 



5 Garbage Collection 

The garbage collector with which we instantiate our BDD implementation
is a mark and sweep garbage collector in the style of [17]. Although
reference counting is the usual choice in imperative frameworks, we believe that
formalizing it in Coq is difficult, since its invariants seem more complex and
a modular decomposition of its verification is not entirely obvious to us. Our
algorithm consists of three phases. In the mark phase, we mark all the nodes
reachable from the set of nodes in the used list using depth first search. In the
first sweep phase, we remove all nodes from the BDD state which are not marked
and add them to the free list. In the next sweep phase, we clean the sharing and
memoization maps of all invalid references (because of nodes that have now been
freed).

Let us describe some Coq code from the mark phase: The following function 
add_used_nodes_l adds all the nodes reachable from a node node to another set 
of nodes marked. A set of nodes is implemented by the type (Map unit), unit 
being the trivial type with only one element tt. The function mark computes 
the set of all reachable nodes by iterating over each node in the input list of used 
nodes. 



Fixpoint add_used_nodes_l
    [bs:BDDstate; node:ad; marked:(Map unit); bound:nat] : (Map unit) :=
  Cases bound of
      O => (* Error *) (M0 unit)
    | (S bound’) =>
      Cases (MapGet ? marked node) of
          NONE =>
            Cases (MapGet ? bs node) of
                NONE => marked
              | (SOME (x,(l,r))) =>
                  (MapPut ? (add_used_nodes_l bs r
                               (add_used_nodes_l bs l marked bound’) bound’)
                          node tt)
            end
        | (SOME tt) => marked
      end
  end.

Definition add_used_nodes := [bs:BDDstate; node:ad; marked:(Map unit)]
  (add_used_nodes_l bs node marked (S (nat_of_ad (bs_var’ bs node)))).

Definition mark := [bs:BDDstate; used:(list ad)]
  (fold_right (add_used_nodes bs) (M0 unit) used).
(* here fold_right iterates the function add_used_nodes for each element in used *)



The new BDD state is computed by restricting the domain of the original 
BDD state to the set of marked nodes. This can be done using the function 
MapDomRestrTo from the map library. 



Definition new_bs := [bs :BDDstate ; used: (list ad)] (MapDomRestrTo ? ? bs (mark bs used)). 



The new free list is computed by adding to the original free list all the nodes
from the old BDD state that are not marked. The sharing and memoization
maps are cleaned by looking at each entry in them and keeping only those that
refer only to marked nodes. Different functions are used for cleaning different
memoization maps and sharing maps, e.g., a function clean’1 for the
memoization map for negation, which is of type (Map ad), and a function clean’2
(which calls the function clean’1) for the memoization map for disjunction (of
type (Map (Map ad))). We omit the definitions of these functions here as they
are quite long and fairly unreadable.
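
The three phases can be summarized in a compact sketch. The Python below is only an illustration under stated assumptions: configurations are dictionaries with hypothetical fields, the mark phase uses an explicit stack rather than the bounded recursion of add_used_nodes_l, and only the sharing map and the negation memoization map are cleaned.

```python
def mark(bs, used):
    """Set of addresses reachable from the nodes in `used` (depth first)."""
    marked = set()
    stack = list(used)
    while stack:
        node = stack.pop()
        if node in marked or node not in bs:   # already seen, or a leaf
            continue
        marked.add(node)
        _, low, high = bs[node]
        stack.extend((low, high))
    return marked

def collect(cfg, used):
    """Return a new configuration that keeps only the reachable nodes."""
    marked = mark(cfg["bs"], used)

    def ok(n):
        return n in (0, 1) or n in marked      # leaves are always OK

    # First sweep: restrict the BDD state and extend the free list.
    new_bs = {n: e for n, e in cfg["bs"].items() if n in marked}
    freed = [n for n in cfg["bs"] if n not in marked]
    # Second sweep: drop map entries that mention freed nodes.
    return {
        "bs": new_bs,
        "free": cfg["free"] + freed,
        "counter": cfg["counter"],
        "share": {k: n for k, n in cfg["share"].items() if n in marked},
        "negm": {n: r for n, r in cfg["negm"].items() if ok(n) and ok(r)},
    }
```

Note that a memoization entry is kept only if both the argument and the result survive, which corresponds to the requirement that entries refer only to marked nodes.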

We prove all the results about the garbage collector mentioned in Section 4.4.
We show that after garbage collection, the resulting BDD configuration is OK,
by showing that the new BDD state is OK (the reducedness and ordering
conditions stated in BDDbounded are satisfied), and that the new free list, sharing and
memoization maps are OK with respect to the new BDD state. We also show
that addresses that were reachable from the input list of used nodes are left
undisturbed (implying that the semantics of all the nodes in the list as Boolean
functions remains unchanged). The proofs related to the functions clean’1 and
clean’2 mostly involve tedious arithmetical arguments related to the way
addresses are stored in the maps, and require proving a series of lemmas to say
that the new maps contain some entry if and only if it is present in the original
maps and satisfies some other conditions, notably that the nodes referred to are
marked. The proofs have less to do with BDDs and more to do with the details
of how maps have been implemented.



6 Experimental Results 



n                40       80      120      160
Coq Time     166.57   597.96  2042.38  2861.62
Caml Time      0.84     3.94    10.84    15.68
Speedup         198      152      188      182
Coq Space      33.1     68.2     39.5     48.8
Caml Space      0.4      4.1      1.0      0.9
Sp. Saving       31       12       19     30.3
#nodes         1000    10000     2000     5000

Fig. 1. Urquhart’s formulae U_n



n                 1      2      3      4      5       6       7      8       9      10
Coq Time       0.09   0.83   3.79  13.43   43.6  129.91  385.92      -       -       -
Caml Time         0   0.01   0.02   0.07    0.3    1.04    3.55  22.47   33.17  155.84
Speedup           -     83    190    192    145     125     109      -       -       -
Coq Space      20.7   21.6     23   27.7   34.1    49.5    84.5      -       -       -
Caml Space  36.10-®  0.007  0.056  0.064  0.059   0.479    2.23   6.30   16.78   27.47
Sp. Saving        -    131  44.23    111     23      60      29      -       -       -
#nodes            8     70    270    854   2498    7006   19030  50000  129162  250000

Fig. 2. Pigeonhole formulae



We conducted several experiments to see whether this implementation is able 
to solve practical problems, how it scales up with problem complexity and the 
speed and space gained by extracting the implementation to OCaml and running 
the resulting, compiled programs. We give here the results for the following kinds 
of formulae. 

— Urquhart’s U-formulae [30,28]: $U_n$ is defined as
$x_1 \Leftrightarrow (x_2 \Leftrightarrow \cdots \Leftrightarrow (x_n \Leftrightarrow (x_1 \Leftrightarrow (x_2 \Leftrightarrow \cdots (x_{n-1} \Leftrightarrow x_n) \cdots ))) \cdots )$.

— Pigeonhole formulae [28]: $pig_n$ states that you cannot put n + 1 pigeons in
n holes with no more than one pigeon per hole.

— The 1990 IMEC benchmarks [31]: these are actual hardware verification
benchmarks. We give the results for the benchmarks in the ex subdirectory:
3 to 8-bit multipliers (mul03 through mul08), 2 to 8-bit ripple adders (rip02,
rip04, rip06, rip08) and others (ex2, transp, ztwaalf1, ztwaalf2),
and in the plasco subdirectory: werner. These involve checking equivalence
between a naive and a more refined implementation of a circuit. These tests
were also used by Harrison [19].



pb.         mul03   mul04   mul05   mul06   mul07   mul08   rip02   rip04   rip06   rip08
Coq Time      9.3   64.91  431.04       -       -            1.05   15.17  103.25  243.28
Caml Time    0.05    0.31    2.39   15.77  109.26               0    0.06    0.43    1.14
Speedup       186     209     180       -       -               -     253     240     213
#nodes        200    1854    8553   35852  100000              72     100     100     200

pb.           ex2  transp  ztwaalf1  ztwaalf2  werner
Coq Time     0.45    0.37     18.25     12.98    1.82
Caml Time       0    0.01      0.08      0.06       0
Speedup         -      37       228        22       -
#nodes         36      22       100       200      80

Fig. 3. IMEC benchmarks



Except for werner, all the examples are tautologies. We measure the time
and space required for building the BDD for a formula (the resulting node
is 1 in the case of tautologies). The policy we use is to allow more nodes to be
allocated only if the garbage collector is unable to free any existing nodes. The
behavior is not quite uniform with respect to the maximum number of nodes to
be allocated, because the garbage collector also requires space and so sometimes
decreasing this number increases the total size requirements. We present here a
few representative figures. Time is measured in seconds, and space in thousands
of kilobytes. The “Speedup” row gives the ratio of the times taken by Coq and
Caml for a formula. Space saving is computed as (a - b)/c, where a is the size of
the Coq process after building the BDD, b is the size beforehand, and c is the
size of the OCaml data (found by (stat ()).live_words).
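
As a worked instance of these definitions, the following Python sketch recomputes the speedup row of Fig. 1 from the two time rows. The function names are hypothetical, and the times are copied from the figure; since the published times are themselves rounded, the recomputed ratios agree with the printed integers only to within one unit.

```python
# Times in seconds, copied from Fig. 1 (Urquhart's formulae).
coq_time = {40: 166.57, 80: 597.96, 120: 2042.38, 160: 2861.62}
caml_time = {40: 0.84, 80: 3.94, 120: 10.84, 160: 15.68}
printed_speedup = {40: 198, 80: 152, 120: 188, 160: 182}

def speedup(coq, caml):
    """Ratio of the Coq time to the OCaml time for one formula."""
    return coq / caml

def space_saving(a, b, c):
    """(a - b) / c: growth of the Coq process over the OCaml live data."""
    return (a - b) / c

ratios = {n: speedup(coq_time[n], caml_time[n]) for n in coq_time}
```

The space saving row cannot be recomputed in the same way from the figures, because the sizes a and b are not reported separately.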

The extracted OCaml programs run around 150 times faster. Space savings
are around 40-60. For pigeonhole formulae, seven pigeons is the limit for Coq and
ten for the OCaml programs. By comparison, a C version of the same algorithms
is able to go up to twelve pigeons. For actual verification problems, even the Coq
implementation runs in only a few seconds, and difficult problems only require
a few minutes (except for the multipliers, which are well known to be too hard
to solve with standard BDDs). The files containing the formulas for mul07 and
mul08 are 607 and 789 lines long respectively.

Although the Coq implementation is relatively slow and consumes a lot of
memory, it is actually a pleasant surprise that such an implementation in an
interpreted λ-calculus atop a simulated store runs in an acceptable amount of
time and memory at all. In fact, it works on examples of quite respectable sizes,
including both hard problems like pigeonhole problems and real-life problems
like the IMEC benchmarks.

7 Conclusion 

The import of this work is fourfold. First, to our knowledge, this is the first
complete formal proof of BDD algorithms; thanks to the Coq extraction mechanism,
this yields a certified implementation in OCaml. Secondly, the BDD algorithms
have not just been modeled in Coq, but completely implemented in Coq itself
viewed as a programming language. This allows us and other users to replace
proofs of propositional formulae in Coq’s logic by mere computations in Coq’s
λ-calculus, and as such provides a seamless integration of BDD techniques with
the Coq proof assistant. We plan to extend this to integrate a whole symbolic
model checker inside Coq. Thirdly, contrary to first expectations, BDD
algorithms implemented in Coq’s λ-calculus atop a simulated store perform well in
practice, even on hard and industrial problems. Finally, to our knowledge, this
work is also the first to prove a garbage collection algorithm in full detail:
what we prove is not just a model of a garbage collector, but an actual
implementation that runs inside Coq itself. Furthermore, this algorithm is more
complex than usual garbage collection algorithms in that it also reclaims space
from memoization tables, which is not accounted for by the standard garbage
collection algorithms but is needed in the context of BDDs.

Abstract. In this paper, we present a framework for specifying and
verifying an important class of hardware systems. These systems are built
up from a parallel composition of circuits switched by a global clock.
They can equivalently be characterised by Petri nets with a maximal
step semantics. As a semantic model for these systems we introduce
Distributed Synchronous Transition Systems (DSTS), which are distributed
transition systems with a global clock synchronising the executions of
actions. We show the relations to asynchronous behaviour of distributed
transition systems employing Mazurkiewicz trace theory, which allows a
uniform treatment of synchronous as well as asynchronous executions.
We introduce a process-algebra-like calculus for defining DSTS, which we
call Synchronous Process Systems. Furthermore, we present Foata
Lineartime Temporal Logic (FLTL), which is a temporal logic with a flavour
of LTL adapted for specifying properties of DSTS. Our main
contributions are decision procedures for satisfiability as well
as model checking of FLTL formulas, both based on alternating Büchi
automata.



1 Introduction 

Many digital circuits, especially embedded controllers, can be modelled as
transition systems with respect to their logical behaviour. The controller is in
one of finitely many states and executes one of its commands, which we will call
actions in this paper. The action modifies the current state, transforming it into
a new one. Usually, the executions are synchronised by a global clock or
oscillator. At every clock tick, an action takes place.^ Several circuits or
controllers for different tasks are combined on a switching board. The global
clock synchronises the execution.

However, the circuits have to coordinate their work. Therefore, they have
to communicate. Typically, the circuits communicate with each other or with
other resources like a shared memory via a common bus. To coordinate the
access to this bus, an arbiter is employed. Every circuit has to ask the arbiter,
which grants the access to the bus. The circuits and the arbiter communicate
via common actions. Figure 1 shows an example of four circuits. Every circuit
is connected by two wires to the arbiter: one is employed for requesting the bus
(r_i), the second one for granting it (g_i). A request of Circuit 1 recognised by the
arbiter can be modelled by the common action r1. If the arbiter is not in the
state for receiving the request, Circuit 1 suspends. Note that in every tick, the
other circuits may execute independent actions.^

^ Actions lasting for more than one tick can be modelled by a sequence of single-tick
actions.




{r1, g1, ..., r4, g4}

Fig. 1. Synchronised digital circuits



We introduce Distributed Synchronous Transition Systems (DSTS) which are 
distributed transition systems with a global clock synchronising the executions 
of actions. They can be understood as a model for the parallel composition of 
hardware circuits as described above. 

Distributed Synchronous Transition Systems can also be interpreted as a
model for the well-known Petri nets with a maximal step semantics [Rei86].
We assume that the reader is familiar with the notion of Petri nets and just
give an example of a Petri net together with a maximal step run. Figure 2
shows a place-transition net. The maximal step semantics is obtained by the
rule that all transitions which are capable of firing simultaneously fire
simultaneously. A possible execution sequence for the presented Petri net would be
{a, e}{b}{c}{d, f}{g}{h}, where each set mentions the actions occurring
concurrently.
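
The maximal step rule can be sketched in a few lines of Python. Everything below is an assumption made for illustration: the net NET is a hypothetical three-transition net (not the net of Figure 2), markings are dictionaries from places to token counts, and a step is chosen greedily in alphabetical order, which yields one maximal step when several exist because of conflicts.

```python
# Hypothetical net: each transition maps to (input places, output places).
NET = {
    "a": (["p1"], ["p2"]),
    "b": (["p2"], ["p3"]),
    "e": (["q1"], ["q2"]),
}

def maximal_step(net, marking):
    """Greedily pick a maximal set of transitions that can fire together."""
    remaining = dict(marking)
    step = set()
    for t in sorted(net):
        inputs, _ = net[t]
        if all(remaining.get(p, 0) >= inputs.count(p) for p in inputs):
            for p in inputs:
                remaining[p] -= 1          # reserve the tokens for t
            step.add(t)
    return step

def fire(net, marking, step):
    """Fire all transitions of `step` simultaneously."""
    new = dict(marking)
    for t in step:
        inputs, outputs = net[t]
        for p in inputs:
            new[p] -= 1
        for p in outputs:
            new[p] = new.get(p, 0) + 1
    return new

def run(net, marking, limit=100):
    """Sequence of maximal steps until no transition is enabled."""
    steps = []
    for _ in range(limit):                 # guard against cyclic nets
        step = maximal_step(net, marking)
        if not step:
            break
        steps.append(step)
        marking = fire(net, marking, step)
    return steps
```

Starting from one token on p1 and one on q1, the independent transitions a and e are forced into the same step and b follows alone, which has the same shape as the beginning of the run {a, e}{b} quoted above.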

Distributed transition systems (DTS) are a well-known model for distributed
systems (cf. [Zie87, TH98]). However, usually they are considered with an
asynchronous model of execution. The simultaneous execution of two independent
actions a and b is modelled by the interleaving of a and b, i.e., first a and then
b as well as b and then a. In this way, concurrency is reduced to sequences and
non-deterministic choice. Our approach is in some sense dual. If two actions a and
b can occur concurrently, then we require them to occur concurrently and abstract
from interleaving.

^ The shown setup of the bus, circuits and the arbiter is described as a sample layout
for the Motorola PowerPC where, for example, Circuit 1 to Circuit 3 are PowerPCs
and Circuit 4 is a memory controller [Mot93].

Distributed transition systems can be treated formally by Mazurkiewicz trace 
theory [DR95]. We show that asynchronous executions of distributed transition 
systems are captured by the configuration graph of a trace and that the Foata 
configuration graph of a trace corresponds to one synchronous execution of the 
DTS. 

Considering the relations between asynchronous and synchronous behaviour
of distributed transition systems, it turns out that a synchronous execution can
be mapped to a set of asynchronous executions differing only in the
interleaving of independent actions. However, not every asynchronous execution
corresponds to a synchronous one. Hence, considering asynchronous executions
yields a more abstract system. For model checking linear time specifications, this
means that one might fail to prove a property although the underlying
system fulfils the requested requirement.

To simplify the task of defining DSTS, we introduce a simple calculus which 
we call Synchronous Process Systems and is inspired by Milner’s CCS [Mil89]. 
Synchronous Process Systems consist of a set of equations, each defining a single 
non-deterministic sequential process. The overall behaviour of the system is 
obtained by taking the concurrent product of the system where the execution, 
synchronisation, and communication is defined in the manner described above. 

Our main contribution is a logical approach for specifying properties of
synchronous transition systems. We introduce Foata Lineartime Temporal Logic
(FLTL), which is a temporal logic with a flavour of Lineartime Temporal Logic
(LTL, [MP92]) adapted for specifying properties of DSTS. We give a decision
procedure for FLTL as well as a model checking procedure, both based on
alternating Büchi automata. It turns out that these procedures meet the known
complexity bounds for LTL, viz. exponential in the length of the formula and
linear in the size of the system, and are essentially optimal. The model
checking procedure employs an optimisation which is similar to a technique known
as partial order reduction [Pel98]. However, instead of defining the interleaving
product of the sequential processes and then trying to omit states not influencing
the result of the model checking procedure, we are able to define smaller systems
directly because of the underlying model.

Synchronous systems have been studied by several authors. Milner defined a
variant of (asynchronous) CCS for synchronous systems (SCCS, [Mil83]). Lustre
[CPHP87] is a programming language for synchronous systems. Usually, these
contributions concentrate on the design of the underlying systems. The
problem of verification is tackled by the notion of bisimilarity or by theorem proving
[BCPVD99]. Simple model-checking-based verification techniques are lacking.
We present a simple model for synchronous hardware systems together with an
implementation-driven definition of a model checking algorithm.

In the next section, we introduce basic notions of Mazurkiewicz trace theory.
Section 3 presents alternating Büchi automata, which will be our tool for the
decision procedure. Sections 4 and 5 define distributed transition systems and
a calculus for synchronous process systems. In Sections 6, 7 and 8 we define
FLTL and give a decision procedure as well as a model checking procedure, respectively.
We draw our conclusions in Section 9.



2 Preliminaries 

A (Mazurkiewicz) trace alphabet is a pair $(\Sigma, I)$, where $\Sigma$, the alphabet, is a
finite set and $I \subseteq \Sigma \times \Sigma$ is an irreflexive and symmetric independence relation.
$(\Sigma, I)$ is sometimes also called an independence alphabet. Usually, $\Sigma$ consists of the
actions performed by a distributed system while $I$ captures a static notion of
causal independence between actions. For the rest of the paper we fix a trace
alphabet $(\Sigma, I)$. We define $D = (\Sigma \times \Sigma) - I$ to be the dependency relation, which
is then reflexive and symmetric. For example, $(\Sigma, I)$ where $\Sigma = \{a, b, c, d\}$ and
$I = \{(a, d), (d, a), (b, c), (c, b)\}$ is an independence alphabet.

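A distributed alphabet $\tilde{\Sigma}$ [PP95] is an n-tuple of alphabets $(\Sigma_1, \dots, \Sigma_n)$ (not
necessarily disjoint). Let $\Sigma = \Sigma_1 \cup \dots \cup \Sigma_n$ and $Proc = \{1, \dots, n\}$. For each $a \in \Sigma$,
denote by $pr(a)$ the set $\{i \in Proc \mid a \in \Sigma_i\}$. For a distributed alphabet $\tilde{\Sigma}$ we
define the independence relation $I(\tilde{\Sigma})$ by
$I(\tilde{\Sigma}) = \{(a, b) \in \Sigma \times \Sigma \mid pr(a) \cap pr(b) = \emptyset\}$.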

A distributed alphabet covers the notion of a system consisting of n compo- 
nents or processes. Actions occurring in the intersection of several components 
are used for communication: They must be executed concurrently. 

It is easy to see that $(\Sigma, I(\tilde{\Sigma}))$ is an independence alphabet. On the other
hand, for an independence alphabet $(\Sigma, I)$ we get a unique (up to the order of
the alphabets) distributed alphabet $\tilde{\Sigma}$ such that $(\Sigma, I) = (\Sigma, I(\tilde{\Sigma}))$ by
considering the maximal dependent subsets of $\Sigma$, i.e., the subsets $\Sigma_i \subseteq \Sigma$ such that
$(a, b) \notin I$ for all $a, b \in \Sigma_i$. For example, the independence alphabet previously
mentioned determines the distributed alphabet $(\{a, b\}, \{a, c\}, \{b, d\}, \{c, d\})$. Note
that the maximal dependent subsets of an independence alphabet correspond to
the maximal cliques when interpreting $(\Sigma, D)$ as a graph. Abusing notation, we
denote by $I(\Sigma)$ also the set of all pairwise independent subsets of $\Sigma$.
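
The passage from a distributed alphabet to its independence relation can be
sketched in a few lines of Python. This is only an illustration under the
definitions above; the function names `pr` and `independence` and the list
`components` are names chosen for this sketch, not notation from the paper.

```python
from itertools import product

def pr(action, components):
    """Indices (starting at 1) of the components whose alphabet contains `action`."""
    return {i for i, sigma_i in enumerate(components, start=1) if action in sigma_i}

def independence(components):
    """All pairs (a, b) whose sets of participating components are disjoint."""
    sigma = set().union(*components)
    return {(a, b) for a, b in product(sigma, repeat=2)
            if not pr(a, components) & pr(b, components)}

# The distributed alphabet used as the running example in this section.
components = [{"a", "b"}, {"a", "c"}, {"b", "d"}, {"c", "d"}]
I = independence(components)
```

For the running example, `I` contains exactly the four pairs relating a with d
and b with c, which is the independence relation fixed earlier in this section.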

Footnote: We fix this alphabet for further examples.

Let $T = (E, \le, \lambda)$ be a $\Sigma$-labelled poset. In other words, $(E, \le)$ is a poset
and $\lambda : E \to \Sigma$ is a labelling function. $\lambda$ can be extended to subsets of $E$ in the
expected manner. We will refer to members of $E$ as events. For $e \in E$ we define
$\downarrow e = \{x \in E \mid x \le e\}$ and $\uparrow e = \{x \in E \mid e \le x\}$. We call $\downarrow e$ the history of the
event $e$. We also let $\lessdot$ be the covering relation given by $x \lessdot y$ iff $x < y$ and for
all $z \in E$, $x \le z \le y$ implies $x = z$ or $z = y$. Moreover, we let the concurrency
relation be defined by $x \mathrel{co} y$ iff $x \not\le y$ and $y \not\le x$.

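A (Mazurkiewicz) trace (over $(\Sigma, I)$) is a $\Sigma$-labelled poset $T = (E, \le, \lambda)$
satisfying:

(T1) $\forall e \in E.\ \downarrow e$ is a finite set.

(T2) $\forall e, e' \in E.\ e \lessdot e'$ implies $\lambda(e) \mathrel{D} \lambda(e')$.

(T3) $\forall e, e' \in E.\ \lambda(e) \mathrel{D} \lambda(e')$ implies $e \le e'$ or $e' \le e$.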

We shall let $TR(\Sigma, I)$ denote the class of traces over $(\Sigma, I)$. As usual, a trace
language $L$ is a subset of traces, i.e. $L \subseteq TR(\Sigma, I)$. Throughout the paper we
will not distinguish between isomorphic elements in $TR(\Sigma, I)$. A trace $(E, \le, \lambda)$
is called finite iff $E$ is finite.

Let $T = (E, \le, \lambda)$ be a trace over $(\Sigma, I)$. A configuration of $T$ is a finite subset
of events $c \subseteq E$ with $\downarrow c = c$ where $\downarrow c = \bigcup_{e \in c} \downarrow e$. The set of configurations of $T$
will be denoted $conf(T)$. Trivially, $\emptyset \in conf(T)$ is always the case. Furthermore,
we define the c-suffix of $T$ by $T \setminus c = (E - c, \le|_{E - c}, \lambda|_{E - c})$. It is then not hard
to see that $T \setminus c \in TR(\Sigma, I)$ for any trace $T \in TR(\Sigma, I)$ and $c \in conf(T)$. For a
configuration $c$ let $top(c)$ denote its maximal elements wrt. $\le$. Every element of
$top(c)$ is called a top event or a top action (if the label of the event is considered).

Moreover, $conf(T)$ can be equipped with a natural transition relation
$\to_T \subseteq conf(T) \times \Sigma \times conf(T)$ given by $c \xrightarrow{a}_T c'$ iff there exists an $e \in E$ such that
$\lambda(e) = a$, $e \notin c$ and $c' = c \cup \{e\}$.

Definition 1. Let $min(T) = \{e \in E \mid e \text{ is minimal in } T \text{ wrt. } \le\}$. The Foata
configuration graph $foata(T) = (C, \to)$ for a trace $T$ is the subgraph of $conf(T)$
where $C$ is the smallest set so that $\emptyset \in C$ and for every $c \in C$ also $c \cup min(T \setminus c) \in C$.

The idea of the Foata configuration graph is to consider the configuration
sequences in which every component has made an action if possible. Figure 3
shows a trace, its configuration graph and its Foata configuration graph. The
formulas of the Foata temporal logic are interpreted wrt. Foata configurations
of traces.

A linearisation of a trace $(E, \le, \lambda)$ is a linearisation of the partial order, i.e.,
it is a total labelled order $(E, \le', \lambda)$ such that $\le \subseteq \le'$. A trace is equal to the
intersection of all its linearisations and, given an independence alphabet $(\Sigma, I)$,
a trace $T$ is uniquely determined by one of its linearisations $T'$. In this case we
call $T$ the expansion of $T'$. A linearisation of a trace $(E, \le, \lambda)$ represents a word

Footnote: Instead of showing the events of the configurations, we present their labels. The
ordering of the events is clear by the independence alphabet and by the rule that
dependent events increase from left to right.
Fig. 3. A trace, its configuration graph and its Foata configuration graph



by considering the sequence of labels. Furthermore, all linearisations of a trace 
form an equivalence class. 

An $\omega$-linearisation of a trace $(E, \le, \lambda)$ is a linearisation $(E, \le', \lambda)$ such that
every event $e \in E$ has a finite history, i.e., $\forall e \in E.\ |\downarrow e| < \infty$. An $\omega$-linearisation
of a trace corresponds to an $\omega$-word and all $\omega$-linearisations of a trace form an
equivalence class in $\Sigma^\omega$.

Definition 2. A Foata linearisation of a trace $(E, \le, \lambda)$ is a linearisation
$(E, \le', \lambda)$ which can be written as a product of finite traces

$$(E, \le', \lambda) = \prod_{i \ge 1} (E_i, \le'_i, \lambda_i)$$

such that for every $i \ge 1$ we have $\le'_i = \le'|_{E_i}$, $\lambda_i = \lambda|_{E_i}$ and $E_i$ is a set of pairwise
independent actions, i.e., $\lambda_i(E_i) \in I(\Sigma)$. Furthermore, for each $i \ge 1$ and every
$e \in E_{i+1}$ there is an $e' \in E_i$ such that $(\lambda(e), \lambda(e')) \in D$. Note that the product
of finite traces is defined in the canonical way:
$(E, \le, \lambda)(E', \le', \lambda') = (E \cup E', \le \cup \le' \cup \{(e, e') \in E \times E' \mid (\lambda(e), \lambda'(e')) \in D\}, \lambda \cup \lambda')$.

A Foata linearisation is an $\omega$-linearisation of a trace and corresponds to an
$\omega$-word in Foata normal form, which is defined similarly [DM96]: A word $w \in \Sigma^\omega$
is in Foata normal form iff

1. $w = w_1 w_2 \dots$

2. for each $i \ge 1$ the word $w_i$ is a product of pairwise independent actions

3. for each $i \ge 1$ and for each letter $a$ of $w_{i+1}$ there exists a letter $b$ in $w_i$ which
   is dependent on $a$ ($(a, b) \in D$).

The words $w_i$ are called steps. It is easy to see that for $w = w_1 w_2 \dots$ in Foata
normal form and for every $i$ the suffix $w_i w_{i+1} \dots$ is in Foata normal form. The
steps correspond to the top actions of every configuration in the Foata
configuration graph of a trace. For example, a Foata linearisation of the trace shown in
Figure 3 is (a)(bc)(ad)(bc) where the steps are accentuated by parentheses.

Note that the Foata normal form as defined here is unique up to permutation
of independent actions within a step. Given a total order $\prec$ for the actions of $\Sigma$
and requiring the steps to be minimal wrt. the lexicographic order derived from
$\prec$ results in a unique Foata normal form for every trace. Such a word can also
be considered as a word over $I(\Sigma)$. However, not every word over $I(\Sigma)$ is in Foata
normal form. For example, (b)(bc) can be considered as a word over $I(\Sigma)$ but its
Foata normal form (as a word over $\Sigma$) would be (bc)(b).
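
The following Python sketch computes the steps of the Foata normal form for
a finite word, as an illustration of the definition above. It is restricted to
finite words (the paper works with $\omega$-words), and the function name
`foata_steps` as well as the representation of the independence relation as a
set of pairs are assumptions of the sketch.

```python
def foata_steps(word, independent):
    """Split a finite word into Foata steps.

    `independent` is a symmetric, irreflexive set of pairs of letters;
    every pair not contained in it counts as dependent.
    """
    steps = []  # steps[k] collects the letters placed in step k
    for a in word:
        # Look for the last step holding a letter on which `a` depends;
        # `a` has to be placed in the step right after it.
        level = 0
        for k in range(len(steps) - 1, -1, -1):
            if any((a, b) not in independent for b in steps[k]):
                level = k + 1
                break
        if level == len(steps):
            steps.append([])
        steps[level].append(a)
    # Sorting each step picks the representative that is minimal
    # wrt. the usual order on letters.
    return ["".join(sorted(step)) for step in steps]

I = {("a", "d"), ("d", "a"), ("b", "c"), ("c", "b")}
```

With the running independence alphabet, `foata_steps("abcadbc", I)` yields the
steps a, bc, ad, bc, and `foata_steps("bbc", I)` yields bc, b, in line with the
two examples given in the text.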

3 Alternating Büchi Automata

Nondeterminism gives an automaton the power of existential choices: A word $w$
is accepted by an automaton iff there exists an accepting run on $w$. Alternation
gives a machine the power of universal choices and was studied in [BL80, CKS81]
(in the context of automata). In this section, we recall the notion of alternating
automata along the lines of [Var96] where alternating Büchi automata are used
for model checking LTL. For an introduction to Büchi automata we refer to
[Tho90].

For a finite set $X$ of variables let $B^+(X)$ be the set of positive Boolean
formulas over $X$, i.e., the smallest set such that

- $X \subseteq B^+(X)$

- true, false $\in B^+(X)$

- $\varphi, \psi \in B^+(X) \Rightarrow \varphi \wedge \psi \in B^+(X),\ \varphi \vee \psi \in B^+(X)$

The dual of a formula $\varphi \in B^+(X)$, denoted by $\overline{\varphi}$, is the formula where false
is replaced by true, true by false, $\vee$ by $\wedge$ and $\wedge$ by $\vee$.

We say that a set $Y \subseteq X$ satisfies a formula $\varphi \in B^+(X)$ ($Y \models \varphi$) iff $\varphi$
evaluates to true when the variables in $Y$ are assigned to true and the members
of $X \setminus Y$ are assigned to false. For example, $\{q_1, q_3\}$ as well as $\{q_1, q_4\}$ satisfy the
formula $(q_1 \vee q_2) \wedge (q_3 \vee q_4)$.
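
Satisfaction and duality for positive Boolean formulas can be sketched as
follows. The encoding of formulas as nested tuples and the function names
`satisfies` and `dual` are choices made for this illustration only.

```python
def satisfies(Y, phi):
    """True iff the set Y of variables satisfies the positive formula phi.

    Formulas are True, False, a variable name (str), or a tuple
    ("and", f, g) / ("or", f, g).
    """
    if isinstance(phi, bool):
        return phi
    if isinstance(phi, str):
        return phi in Y
    op, left, right = phi
    if op == "and":
        return satisfies(Y, left) and satisfies(Y, right)
    return satisfies(Y, left) or satisfies(Y, right)

def dual(phi):
    """Swap true/false and and/or; variables stay unchanged."""
    if isinstance(phi, bool):
        return not phi
    if isinstance(phi, str):
        return phi
    op, left, right = phi
    return ("or" if op == "and" else "and", dual(left), dual(right))

# The example formula (q1 or q2) and (q3 or q4) from the text.
phi = ("and", ("or", "q1", "q2"), ("or", "q3", "q4"))
```

Both `{"q1", "q3"}` and `{"q1", "q4"}` satisfy `phi`, whereas `{"q1", "q2"}`
does not; it does, however, satisfy `dual(phi)`.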

Let us consider a Büchi automaton. For a state $q$ of the automaton and an
action $a$ let $\{q_1, \dots, q_k\} = \{q' \mid q \xrightarrow{a} q'\}$ be the set of possible next states for
$(q, a)$. The key idea for alternation is to describe the nondeterminism by the
formula $q_1 \vee \dots \vee q_k \in B^+(Q)$. Hence, we write $q \xrightarrow{a} q_1 \vee \dots \vee q_k$. If $k = 0$ we
write $q \xrightarrow{a}$ false. Alternation is introduced by allowing an arbitrary formula
of $B^+(Q)$. Let us be more precise:

An Alternating Büchi Automaton (ABA) over an alphabet $\Sigma$ is a tuple
$A = (Q, \delta, q_0, F)$ such that $Q$ is a finite nonempty set of states, $q_0 \in Q$ is the initial
state, $F \subseteq Q$ is a set of accepting states and $\delta : Q \times \Sigma \to B^+(Q)$ is the
transition function.

Because of universal quantification a run is no longer a sequence but a tree.
A $Q$-labelled tree $\tau$ is a pair $(t, T)$ such that $t$ is a tree and $T : nodes(t) \to Q$. To
simplify the presentation, we let $nodes(t)$ be implicitly defined and refer to its
elements in the following canonical way: The root is denoted by $\varepsilon$ and if a node
$s$ is denoted by $w$ then a child $s'$ labelled with $q$ by $T$ is denoted by $wq$. Since
for our needs we can restrict $T$ to be one-to-one for children of the same parent
this is well-defined.

For a node $s$ let $|s|$ denote its height, i.e., $|\varepsilon| = 0$, $|wq| = |w| + 1$. A branch
of $\tau$ is a maximal sequence $\beta = s_0, s_1, \dots$ of nodes of $\tau$ such that $s_0$ is the root
of $\tau$ and $s_i$ is the father of $s_{i+1}$, $i \in \mathbb{N}$. The word induced by $\beta$ (for short $\beta$'s
word) is the sequence of labels of $\beta$, i.e., the sequence $T(s_0), T(s_1), \dots$

A run of an alternating BA $A = (Q, \delta, q_0, F)$ on a word $w = a_0 a_1 \dots$ is a
(possibly infinite) $Q$-labelled tree $\tau$ such that $T(\varepsilon) = q_0$ and the following holds:

if $x$ is a node with $|x| = i$, $T(x) = q$ and $\delta(q, a_i) = \varphi$ then either
$\varphi \in \{true, false\}$ and $x$ has no children or $x$ has $k$ children $x_1, \dots, x_k$ for some
$k \le |Q|$ and $\{T(x_1), \dots, T(x_k)\}$ satisfies $\varphi$.

The run $\tau$ is accepting if every finite branch ends on true (i.e., $\delta(T(x), a_i) = true$
where $x$ denotes the maximum element of the branch wrt. the height and $i$
denotes its height) and every infinite branch of $\tau$ hits an element of $F$ infinitely
often.

It is obvious that every Büchi automaton can be turned into an equivalent
(wrt. the accepted language) alternating Büchi automaton in the way described
above. The converse is also true and is described for example in [Var96]. The
construction involves an exponential blow-up. Hence, it is easy to see that the
emptiness problem for ABAs is exponential in the number of states.



4 Distributed Synchronous Transition Systems (DSTS) 



We introduce a formal model for the underlying concurrent systems, Distributed
Synchronous Transition Systems. It is based on Zielonka's asynchronous
automata (without final states, [Zie87]) or the notion of distributed transition
systems (described for example in [TH98]). However, the definition of a run is
modified to reflect the idea of a global synchronising clock. Strictly speaking, we
only define Distributed Transition Systems (DTS) as well as their synchronous
and asynchronous runs. However, to denote the context we speak of either
synchronous or asynchronous DTS.

Definition 3. A Distributed Transition System (DTS) over a distributed
alphabet $\tilde{\Sigma}$ is a tuple $A = (Q_1, \dots, Q_n, \to, \mathcal{I})$ where

- Each $Q_i$ is a finite nonempty set of local states of the ith component.

- Let $Q = \prod_{i \in Proc} Q_i$ be the set of global states and $\mathcal{I} \subseteq Q$ be the set of
  initial states.

Footnote: Our presentation of DTS is inspired by [PP95].
- Let $States = \prod_{i \in Proc} (Q_i \cup \{-\})$. The dummy $-$ is used as a placeholder in
  components which have no significance for the transition: $\to \subseteq States \times
  \Sigma \times States$ is a transition relation satisfying the following condition:

  if $(q, a, q') \in \to$ then $q[i] = q'[i] = -$ for $i \in Proc \setminus pr(a)$
  where $q[i] \in Q_i \cup \{-\}$ denotes the ith component of $q$.

The dummy $-$ in the definition of the transition relation $\to$ is used for
denoting components of a global state which are not affected by the transition.
Given a state $q = (q_1, \dots, q_n) \in Q$ we denote by $q|_M$ the element $(q'_1, \dots, q'_n) \in
States$ such that $q'_i = q_i$ for $i \in M$ and $q'_i = -$ else.

Definition 4. A synchronous execution $\rho$ of a DTS is an infinite sequence
$q_1 A_1 q_2 \dots$ of global states and sets of pairwise independent actions which
satisfies the following conditions:

- $q_1 \in \mathcal{I}$, i.e., $q_1$ is an initial state.

- For $j \ge 1$ and all $a \in A_j$, $(q_j|_{pr(a)}, a, q_{j+1}|_{pr(a)}) \in \to$ and $q_j|_P = q_{j+1}|_P$
  for $P = Proc \setminus \bigcup_{a \in A_j} pr(a)$. Hence, a transition is the "parallel" execution
  of concurrent actions according to the local transition rules.

- Furthermore, $A_j$ must be maximal in the following sense: For every $j \ge 1$,
  for all $A'_j \in I(\Sigma)$ with $A'_j \supseteq A_j$ such that there is a $q'_{j+1}$ with $q_j A'_j q'_{j+1}$
  we have $A'_j = A_j$. This ensures that all components being able to do a local

Fig. 4. A distributed transition system and a synchronous execution



Abusing notation we call a DTS also a Distributed Synchronous Transition
System if we consider its synchronous executions. Figure 4 shows a graphical
representation of a DTS and (a part of) one of its executions. The dotted line
describes the initial state of the system. The crucial point of the execution is
that the actions b and c must occur synchronously.

For a distributed transition system, we also define the notion of an
asynchronous execution which is obtained by sequentialising and interleaving
transitions.

Definition 5. An asynchronous execution $\rho$ of a DTS is an infinite sequence
$q_1 a_1 q_2 \dots$ of global states $q_j$ and actions $a_j$ which satisfies the following
conditions:

- $q_1 \in \mathcal{I}$, i.e., $q_1$ is an initial state.

- For every $j \ge 1$, we have $(q_j|_{pr(a_j)}, a_j, q_{j+1}|_{pr(a_j)}) \in \to$ and
  $q_j|_{Proc \setminus pr(a_j)} = q_{j+1}|_{Proc \setminus pr(a_j)}$. The triple $(q_j, a_j, q_{j+1})$ is called the jth state of $\rho$ and is
  abbreviated by $\rho(j)$.

In the same manner we call a distributed transition system also a distributed
asynchronous transition system when considering its asynchronous executions.
For the distributed transition system presented in Figure 4

$(q_{11}, q_{21}, q_{31}, q_{41})\ a\ (q_{12}, q_{22}, q_{31}, q_{41})\ b\ (q_{11}, q_{22}, q_{32}, q_{41})\ c$
$(q_{11}, q_{21}, q_{32}, q_{42})\ d\ (q_{11}, q_{21}, q_{31}, q_{41})$

would be (a part of) an asynchronous run.

A synchronous execution can be mapped to a Foata configuration graph in
the obvious way. Furthermore, given a synchronous execution, an asynchronous
execution can be obtained by interleaving the actions of each step $A_j$; we call
the result an interleaving of the synchronous run. However, it is an easy
exercise to see that not every asynchronous run can be obtained by interleaving a
synchronous one.
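
The difference between the two kinds of execution can be made concrete with
a small Python sketch. Several things here are assumptions of the sketch and
not taken from the paper: the local transition tables in `LOCAL` are a
hypothetical system chosen so that it reproduces the asynchronous run listed
above (Figure 4 itself is not reproduced in the text), transitions are given
per component and are deterministic, and all function names are invented for
the illustration.

```python
from itertools import combinations

COMPONENTS = [{"a", "b"}, {"a", "c"}, {"b", "d"}, {"c", "d"}]
# Hypothetical local transitions, one table per component:
# (local state, action) -> next local state.
LOCAL = [
    {("q11", "a"): "q12", ("q12", "b"): "q11"},
    {("q21", "a"): "q22", ("q22", "c"): "q21"},
    {("q31", "b"): "q32", ("q32", "d"): "q31"},
    {("q41", "c"): "q42", ("q42", "d"): "q41"},
]

def pr(a):
    """Indices of the components taking part in action a."""
    return [i for i, sigma_i in enumerate(COMPONENTS) if a in sigma_i]

def independent(a, b):
    return not set(pr(a)) & set(pr(b))

def enabled(q):
    """Actions for which every participating component has a local move."""
    sigma = set().union(*COMPONENTS)
    return {a for a in sigma if all((q[i], a) in LOCAL[i] for i in pr(a))}

def fire(q, actions):
    """Execute a set of pairwise independent actions in one step."""
    q = list(q)
    for a in actions:
        for i in pr(a):
            q[i] = LOCAL[i][(q[i], a)]
    return tuple(q)

def maximal_steps(q):
    """Maximal sets of pairwise independent enabled actions in q."""
    en = sorted(enabled(q))
    sets = [set(c) for r in range(1, len(en) + 1) for c in combinations(en, r)
            if all(independent(a, b) for a, b in combinations(c, 2))]
    return [s for s in sets if not any(s < t for t in sets)]

q0 = ("q11", "q21", "q31", "q41")
q1 = fire(q0, {"a"})
```

In state `q1` both b and c are enabled. A synchronous execution has to take the
maximal step `{"b", "c"}`, whereas an asynchronous execution may fire b alone,
which gives the third state of the run shown above.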

Theorem 1. For distributed transition systems the class of (interleaved) syn- 
chronous executions is strictly contained in the class of asynchronous executions. 

Proof. Consider the system depicted in Figure 5. For every asynchronous exe- 
cution corresponding to a synchronous one, there is an action d between two 
actions a. This does not hold for every asynchronous run. 

For model checking, the previous theorem implies that considering the more 
abstract asynchronous behaviour of a hardware system could yield false evidence. 

5 A calculus for DSTS 

In this section, we introduce the process calculus Synchronous Process System 
(SPS) which may be employed to define a distributed synchronous transition 
system. Within the area of verification, a distributed system is preferably given 
in terms of such a calculus instead of directly presenting the automaton. 

Let $F = \{nil^{(0)}, .^{(2)}, +^{(2)}\}$ be a ranked alphabet, $\Sigma$ a finite set of nullary
actions and $P$ a variable. The set of sequential process terms $SPT(\Sigma, P)$ is
inductively defined by

- $P, nil \in SPT(\Sigma, P)$

- $t_1, t_2 \in SPT(\Sigma, P),\ a \in \Sigma \Rightarrow a.t_1,\ t_1 + t_2 \in SPT(\Sigma, P)$

[Figure 5: component alphabets {a, b}, {a, c}, {b, d}, {c, d}]

Fig. 5. Asynchronous runs and synchronous runs differ

A process definition over $(\Sigma, P)$ is an equation $P = T(P)$ where $T(P)$ is
a sequential process term over $(\Sigma, P)$. A synchronous process system over a
distributed alphabet $\tilde{\Sigma} = (\Sigma_1, \dots, \Sigma_n)$ and a finite set of process variables
$V = \{P, P_1, \dots, P_n\}$ is a set of equations ($i \in \{1, \dots, n\}$)

$$\{P = P_1 \parallel \dots \parallel P_n,\ P_i = t_i\}$$

where $t_i \in SPT(\Sigma_i, P_i)$ and $\parallel$ is an n-ary (parallel) operator.

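The semantics of a synchronous process system is defined in two steps. The
semantics of a process definition $P = t$ is a (finite) transition system $(S, \to)$
where $S = \{P\} \cup Sub(t)$ and $\to \subseteq S \times \Sigma \times S$ is a labelled transition relation
defined by the following inference rules

$$
\frac{}{a.t_1 \xrightarrow{a} t_1}
\qquad
\frac{t_1 \xrightarrow{a} t'_1}{t_1 + t_2 \xrightarrow{a} t'_1}
\qquad
\frac{t_2 \xrightarrow{a} t'_2}{t_1 + t_2 \xrightarrow{a} t'_2}
\qquad
\frac{t \xrightarrow{a} t'}{P \xrightarrow{a} t'}\ (P = t)
$$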
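
The four inference rules translate directly into a recursive function. The
sketch below is an illustration only: the tuple encoding of terms and the
names `NIL`, `VAR`, `prefix`, `choice` and `steps` are inventions of this
sketch, and it assumes that every occurrence of the variable P in the body is
guarded by an action prefix (otherwise the recursion would not terminate).

```python
NIL = ("nil",)
VAR = ("var",)  # stands for the process variable P

def prefix(a, t):
    """The term a.t"""
    return ("pre", a, t)

def choice(t1, t2):
    """The term t1 + t2"""
    return ("sum", t1, t2)

def steps(t, body):
    """All pairs (a, t') with t --a--> t', for the definition P = body."""
    kind = t[0]
    if kind == "pre":   # a.t1 --a--> t1
        return {(t[1], t[2])}
    if kind == "sum":   # moves of either summand
        return steps(t[1], body) | steps(t[2], body)
    if kind == "var":   # P moves like its body
        return steps(body, body)
    return set()        # nil has no transitions

# Example definition: P = a.b.P + c.nil
body = choice(prefix("a", prefix("b", VAR)), prefix("c", NIL))
```

For this example, `steps(VAR, body)` contains an a-move to the term b.P and a
c-move to nil, and nil itself has no moves.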



The semantics of a process system $V = \{P = P_1 \parallel \dots \parallel P_n,\ P_i = t_i\}$
is a distributed synchronous transition system defined in the following way:
For $P_i$ let $(S_i, \to_i)$ be its semantics. Let
$\to = \{(q_1, \dots, q_n, a, q'_1, \dots, q'_n) \mid \forall i \in pr(a).\ (q_i, a, q'_i) \in \to_i$ and $\forall i \in Proc \setminus pr(a).\ q_i = q'_i = -\}$.
The distributed transition system for $V$ is $(S_1, \dots, S_n, \to, \{(P_1, \dots, P_n)\})$.

A drawback of our calculus is that it is not expressively complete with respect
to DSTS. It is an easy exercise to see that our calculus cannot define the trace
language $\Sigma^* a c \Sigma^*$ over the alphabet $\Sigma = \{a, b, c\}$ where the only independent
actions are a and c. In contrast, this is simple when DSTS are considered.



Lemma 1. The class of languages definable by SPS is strictly contained in the 
class of languages definable by DSTS. 



Footnote: $Sub(t)$ denotes the set of subterms of $t$ defined in the usual way.

For completeness, we mention that it is easy to see that SPS define so-called
regular product languages while DSTS define regular trace languages [Thi95].

6 Foata Linear Time Temporal Logic (FLTL) 

In this section, we introduce Foata Linear Time Temporal Logic (FLTL) which 
is patterned after LTL and may be used to specify the behaviour of a distributed 
synchronous transition system. A crucial difference to LTL is that (independent) 
sets of actions may be employed to define atomic steps of a DSTS. 

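Let $(\Sigma, I)$ be an independence alphabet and $I(\Sigma)$ the set of pairwise
independent subsets of $\Sigma$. FLTL$(\Sigma, I)$ is the least set of formulas that satisfies for
all $\varphi, \psi \in$ FLTL$(\Sigma, I)$:

- $tt \in$ FLTL$(\Sigma, I)$ and $\langle A \rangle \varphi \in$ FLTL$(\Sigma, I)$

- $\neg\varphi \in$ FLTL$(\Sigma, I)$ and $O\varphi \in$ FLTL$(\Sigma, I)$

- $\varphi \wedge \psi \in$ FLTL$(\Sigma, I)$ and $\varphi U \psi \in$ FLTL$(\Sigma, I)$

where $A \in I(\Sigma)$.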

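Let $T$ be a trace over $(\Sigma, I)$ and $(foata(T), \to)$ its Foata configuration graph.
The satisfaction relation for a formula $\varphi \in$ FLTL$(\Sigma, I)$ wrt. a Foata configuration
$c$ of $T$ is inductively defined by

- $T, c \models tt$,

- $T, c \models \neg\varphi$ iff $T, c \not\models \varphi$,

- $T, c \models \varphi \wedge \psi$ iff $T, c \models \varphi$ and $T, c \models \psi$,

- $T, c \models \langle A \rangle \varphi$ iff there exists an $A' \in I(\Sigma)$, $A' \supseteq A$ and $c' \in (foata(T), \to)$
  such that $c \xrightarrow{A'} c'$ and $T, c' \models \varphi$ where $c \xrightarrow{A'} c'$ iff $c, c' \in foata(T)$, $\lambda(c' \setminus c) = A'$
  and $\lambda(c' \setminus c) \in I(\Sigma)$,

- $T, c \models O\varphi$ iff there exists an $A \in I(\Sigma)$ and $c' \in (foata(T), \to)$ such that
  $c \xrightarrow{A} c'$ and $T, c' \models \varphi$,

- $T, c \models \varphi U \psi$ iff there exists $c' \in (foata(T), \to)$, $c' \supseteq c$ such that $T, c' \models \psi$
  and for all $c'' \in (foata(T), \to)$ with $c \subseteq c'' \subset c'$ we have $T, c'' \models \varphi$.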
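
To see the semantics at work, the following Python sketch evaluates FLTL
formulas on an ultimately periodic sequence of steps, i.e. a finite prefix
followed by a loop that is repeated forever. This restriction, the tuple
encoding of formulas and the name `holds` are assumptions of the sketch; the
paper itself interprets formulas over arbitrary traces. Position i stands for
the Foata configuration reached after i steps, and `steps[i]` is the step
leaving it.

```python
def holds(phi, prefix, loop):
    """Set of positions at which phi holds along prefix . loop^omega.

    Formulas: ("tt",), ("not", f), ("and", f, g), ("next", f),
    ("dia", A, f) for <A>f with A a set of actions, ("until", f, g).
    `loop` must be non-empty.
    """
    steps = prefix + loop
    n = len(steps)
    succ = lambda i: i + 1 if i + 1 < n else len(prefix)
    kind = phi[0]
    if kind == "tt":
        return set(range(n))
    if kind == "not":
        return set(range(n)) - holds(phi[1], prefix, loop)
    if kind == "and":
        return holds(phi[1], prefix, loop) & holds(phi[2], prefix, loop)
    if kind == "next":
        sub = holds(phi[1], prefix, loop)
        return {i for i in range(n) if succ(i) in sub}
    if kind == "dia":
        # The step actually taken must be a superset of A.
        sub = holds(phi[2], prefix, loop)
        return {i for i in range(n) if phi[1] <= steps[i] and succ(i) in sub}
    if kind == "until":
        left = holds(phi[1], prefix, loop)
        result = set(holds(phi[2], prefix, loop))
        changed = True
        while changed:  # least fixpoint: add positions whose successor is in result
            new = {i for i in left if succ(i) in result} - result
            result |= new
            changed = bool(new)
        return result
    raise ValueError(kind)

TT = ("tt",)
PREFIX, LOOP = [{"a"}], [{"b", "c"}, {"a", "d"}]
```

On the step sequence (a)(bc)(ad)(bc)(ad)..., the formula `("dia", {"b"}, TT)`
holds exactly at position 1, because the step taken there, {b, c}, is a
superset of {b}.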

For formulas of the kind $\langle A \rangle \varphi$, we require a superset $A'$ of $A$ to exist for
transforming the system from configuration $c$ to $c'$. This simplifies the task of
specification since the user only has to specify the actions he or she wants to see,
while the atomic actions of the components not involved in actions of
$A$ are left unspecified. If we change our semantics so that exactly the actions
specified must be employed to move from configuration $c$ to $c'$, we can
transform every formula of our logic into the modified logic by taking any combination of the
remaining actions. However, in general this causes an exponential blow-up of the
formula, increasing the overall complexity of deciding and model checking. It
is an easy exercise to enrich our logic and algorithms by additional operators
requiring $A'$ to be a subset of $A$ or to be equal to $A$ without increasing the
complexity.

As usual, we introduce abbreviations of the following kind to simplify
the task of specifying properties:

- $\varphi \vee \psi$ for $\neg(\neg\varphi \wedge \neg\psi)$

- $\Diamond\varphi$ for $tt\ U\ \varphi$

- $\Box\varphi$ for $\neg\Diamond\neg\varphi$

Hence, it is possible to express global liveness and safety properties in a 
manner as known from LTL (see [MP92] for an introduction to specification via 
LTL). 

Our logic can be understood as LTL over the alphabet $I(\Sigma)$ with the
exception of the different interpretation of the next-state operator. One might
therefore think of employing the standard LTL algorithms for deciding FLTL. However,
since not every word in $I(\Sigma)^*$ is in Foata normal form, the standard algorithms
for deciding and model checking LTL have to be modified to
consider only models in Foata normal form. Furthermore, the algorithms have to be
modified to respect our special form of the next-state operator(s). Concerning
model checking, the models to analyse are given by distributed transition
systems over the alphabet $\Sigma$. To employ a logic over $I(\Sigma)$, the transition systems
have to be transformed into a single bisimilar one over $I(\Sigma)$. For practical
reasons, this has to be carried out on-the-fly. Altogether, we are convinced that
understanding FLTL as a logic over $I(\Sigma)$ is theoretically more elegant but is the
second choice for practical algorithms. We therefore directly formulate a
decision procedure and a model checking algorithm for $\Sigma$ since this yields a more
efficient practical implementation. However, the presentation is a little more
technically involved.

7 Deciding FLTL 

We now present a decision procedure for FLTL formulas by means of alternating
Büchi automata. Given a formula $\varphi \in$ FLTL$(\Sigma, I)$, we define an automaton $A_\varphi$
that for all Foata linearisations $w$ accepts $w$ if and only if the expansion $T$ of $w$
satisfies $\varphi$.

As a second step, we will define a Büchi automaton $A_{\mathcal{F}}$ accepting a word
in $\Sigma^\omega$ iff it is in Foata normal form. Hence, the language of the automaton
accepting the intersection of the languages of $A_\varphi$ and $A_{\mathcal{F}}$ is non-empty if and only
if $\varphi$ is satisfiable.

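Definition 6. Let $\varphi \in$ FLTL$(\Sigma, I)$. Then $A_\varphi = (Q, \Sigma, \delta, q_0, F)$ is defined by
$Q = I(\Sigma) \times (Sub(\varphi) \cup \neg Sub(\varphi))$, $q_0 = (\emptyset, \varphi)$ and $\delta : Q \times \Sigma \to B^+(Q)$
by

$$
\begin{aligned}
(S, tt, a) &\mapsto true \\
(S, \psi \wedge \eta, a) &\mapsto \delta(S, \psi, a) \wedge \delta(S, \eta, a) \\
(S, \neg\psi, a) &\mapsto \overline{\delta(S, \psi, a)} \\
(S, \langle A \rangle \psi, a) &\mapsto
  \begin{cases}
    \delta(\emptyset, \psi, a) & \text{if } a \mathrel{D} S,\ A \subseteq S \\
    false & \text{if } a \mathrel{D} S,\ A \not\subseteq S \\
    (S \cup \{a\}, \langle A \rangle \psi) & \text{if } a \mathrel{I} S
  \end{cases} \\
(S, O\psi, a) &\mapsto
  \begin{cases}
    \delta(\emptyset, \psi, a) & \text{if } a \mathrel{D} S \\
    (S \cup \{a\}, O\psi) & \text{if } a \mathrel{I} S
  \end{cases} \\
(\emptyset, \psi U \eta, a) &\mapsto \delta(\emptyset, \eta \vee (\psi \wedge O(\psi U \eta)), a)
\end{aligned}
$$

The set of final states is, as usual, given by the states with negative formulas,
$F = I(\Sigma) \times \{\neg\psi : \psi \in Sub(\varphi)\}$.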

Given a linearisation of a trace in Foata normal form, the different steps are
characterised by actions dependent on the preceding step. Hence, the transition
function of the automaton $A_\varphi$ for a formula $\varphi \in$ FLTL collects the independent
actions of the current step and checks the formula as soon as a dependent action
occurs. Until formulas are directly unwound according to the equivalence
$\varphi U \psi \equiv \psi \vee (\varphi \wedge O(\varphi U \psi))$. Hence, the transition function $\delta$ just treats the situation for
an empty step.

Every finite branch of a run of $A_\varphi$ ending in true gives a proof for our
formula. Infinite branches only occur by infinitely often unwinding until formulas.
Hence, they must be accepted iff the until formula is negated. This shows the
correctness of our construction.

Now, we define an automaton $A_{\mathcal{F}}$ accepting Foata linearisations of traces. It
can be understood as a kind of filter, rejecting $\omega$-words which cannot be a Foata
linearisation of a trace. The intersection of $A_{\mathcal{F}}$ and $A_\varphi$ is the automaton to be
checked for (non-)emptiness to decide the satisfiability of $\varphi$.

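Definition 7. $A_{\mathcal{F}} = (Q, \Sigma, \delta, q_0, F)$ is defined by $Q = 2^\Sigma \times I(\Sigma)$, $q_0 = (\Sigma, \emptyset)$,
$F = Q$ and $\delta : Q \times \Sigma \to 2^Q$ by $(G, S, a) \mapsto \emptyset$ if $a \notin G$; else, if $a \in G$, then
$(G, S, a) \mapsto \{(G', S')\}$ where $G'$ and $S'$ are defined in the following way: if $a \mathrel{I} S$
then $S' = S \cup \{a\}$, $G' = G$ and if $a \mathrel{D} S$ then $S' = \{a\}$, $G' = D(S) = \{b \in \Sigma :
\exists c \in S.\ b \mathrel{D} c\}$.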

According to the definition of the Foata normal form (see Section 2) a word
is in Foata normal form if it can be written as a product of steps. A step is a word
of pairwise independent letters. Furthermore, for every action of a step (excluding
the first step) there is a dependent action in the previous one. $A_{\mathcal{F}}$ reads a word step by
step. A part of a step is stored in $S$. An action independent of the actions of the
current step must belong to the current step. Hence it is added to $S$. As soon as
an action is read which is dependent on one of the actions of the current step, it
must be part of the next step, which is initialised by this action. Furthermore, to
reflect the second requirement for the steps, we store in $G$ the actions dependent
on $S$. These are the good actions which we allow to be read from now on. This
ensures that all actions read from now on are dependent on the previous step
(because of $(G, S, a) \mapsto \emptyset$ if $a \notin G$).
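
The filter idea can be tried out on finite words with the following Python
sketch. It is not a transcription of Definition 7. One deliberate assumption
of the sketch is that the membership test in $G$ is applied only to letters
that join the current step, since a letter that opens a new step depends on
the current step by construction. The function name `accepts_prefix` and the
encoding of the independence relation as a set of pairs are likewise choices
of this illustration.

```python
def accepts_prefix(word, sigma, independent):
    """Run the filter on a finite word; False once no transition exists."""
    def dep(a, b):
        return (a, b) not in independent

    G, S = set(sigma), set()  # good actions, current (partial) step
    for a in word:
        if all(not dep(a, b) for b in S):
            # a is independent of the current step and therefore joins it;
            # it has to depend on some action of the previous step.
            if a not in G:
                return False
            S = S | {a}
        else:
            # a depends on the current step and opens the next one;
            # from now on only actions depending on the old step are good.
            G = {b for b in sigma if any(dep(b, c) for c in S)}
            S = {a}
    return True

SIGMA = {"a", "b", "c", "d"}
I = {("a", "d"), ("d", "a"), ("b", "c"), ("c", "b")}
```

The word abcadbc, i.e. (a)(bc)(ad)(bc), passes the filter, whereas bbc, i.e.
(b)(bc), is rejected when c is read: c is independent of the current step {b}
but does not depend on any action of the previous step.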

Complexity. It is easy to see that for $\varphi \in$ FLTL the size of $A_\varphi$ is linear in the
size of $\varphi$. Hence, the size of the resulting Büchi automaton is exponential in the
size of $\varphi$. $A_{\mathcal{F}}$ is independent of $\varphi$, and so is its size. Hence, deciding whether there
is a model for $\varphi$ is exponential in its length. This is optimal since for the empty
independence relation, we are in the situation of LTL.

8 Model Checking for DSTS and FLTL 

In this section, we present a model checking algorithm for FLTL wrt. the
executions of a distributed synchronous transition system. Given a DSTS $A$ and a
formula $\varphi$, we construct a Büchi automaton $B_A$ accepting for every synchronous
execution of $A$ a single asynchronous one. In contrast to accepting every
asynchronous execution, this reduces the number of possible transitions and, more
importantly, the number of reachable states. For the negation of $\varphi$, we construct an
ABA $A_{\neg\varphi}$ and transform it into a Büchi automaton $B_{\neg\varphi}$ as described in the previous
section. Testing the intersection of $B_A$ and $B_{\neg\varphi}$ for emptiness answers whether
there is an execution of $A$ violating $\varphi$.

Definition 8. Let $A = (Q_1, \dots, Q_n, \to, \mathcal{I})$ be a DSTS. Then let
$B_A = (Q, \delta, \mathcal{I} \times \{\emptyset\}, F)$ be the BA defined by $Q = Q_1 \times \dots \times Q_n \times I(\Sigma)$ and $F = Q$, i.e.,
every state is also a final state. Fix a linear order $\prec$ on the alphabet $\Sigma$. We call
an action $a$ enabled in $q$ iff there is a $q' \in Q$ such that $(q|_{pr(a)}, a, q'|_{pr(a)}) \in \to$.

Let $((q, S), a, (q', S')) \in \delta$ iff

1. $(q|_{pr(a)}, a, q'|_{pr(a)}) \in \to$, i.e., it is a valid transition according to the
   underlying DTS, and

2. if $a \mathrel{D} S$ then

   (a) $\{b \in \Sigma \mid b \mathrel{I} S$ and $b$ enabled in $q\} = \emptyset$, i.e., there is no action
       independent of the current step left for execution, and

   (b) $a$ is strictly smaller than each element of the set $\{b \in \Sigma \mid b \mathrel{I} a$ and
       $b$ enabled in $q\}$ wrt. $\prec$, $S' = \{a\}$

   else

3. if $a \mathrel{I} S$ then $a$ is strictly smaller than each element of $\{b \in \Sigma \mid b \mathrel{I} a$, $b \mathrel{I} S$ and $b$
   enabled in $q\}$ wrt. $\prec$, $S' = S \cup \{a\}$.

Item 2.(a) guarantees that the next step is considered only if the current one
is "full". Items 2.(b) and 3 handle the selection of "equivalent" transitions. The
rule is: just let the smallest action wrt. $\prec$ make a transition. The new current
step $S'$ is treated as in the definition of $A_{\mathcal{F}}$ (see Section 7). Note that we do
not have to concentrate on good actions since our selection strategy ensures that
a step is filled before the next one is considered. This shows that $B_A$ accepts for every
synchronous execution of $A$ exactly one linearisation. For example, $B_A$ for $A$ as
in Figure 4 is shown in Figure 6.

Complexity. Model checking is exponential in the size of the formula and linear in
the size of $B_A$. The size of $B_A$ is exponential in the number of components of
$A$. The experience gained with partial order reduction [Pel98] suggests
that the number of reachable states is much smaller in the average case.

9 Conclusion 

In this paper we presented a framework for verifying synchronous hardware
systems. We introduced a suitable semantic model, Distributed Synchronous
Transition Systems (DSTS), which are distributed transition systems with a global

Footnote: Note that it suffices to define $\prec$ on pairs of independent actions only. Hence,
$\Sigma_1, \dots, \Sigma_n$ induces an appropriate $\prec$.
Fig. 6. A Büchi automaton for the DSTS shown in Figure 4

clock synchronising the executions of actions. Considering a sample layout for 
PowerPC systems, we proved that our approach is adequate for hardware sys- 
tems. We also presented a characterisation by the maximal step semantics for 
Petri nets. We pointed out that, considering asynchronous executions of the un- 
derlying system instead of its synchronous behaviour, one might fail to prove 
some of its properties. We explained that executions of distributed transition 
systems can be described within Mazurkiewicz trace theory, especially by Foata 
configuration graphs of traces. To enrich our approach, we introduced the calcu- 
lus Synchronous Process Systems simplifying the task of defining DSTS. 

The main advantage of our approach is the support by temporal logic
specifications and automatic decision procedures for satisfiability and model checking.
We defined Foata Lineartime Temporal Logic (FLTL), which is a temporal logic
with a flavour of LTL adapted for specifying properties of DSTS. We developed
a decision procedure for satisfiability as well as a model checking procedure for FLTL
specifications, both based on alternating Büchi automata. As future work, we plan
to integrate this model together with FLTL in the verification platform Truth
[LLNT99].
Abstract. We present a Control Flow Analysis (CFA) for Safe
Ambients, a variant of the calculus of Mobile Ambients. The analysis refines
[12] and computes an approximation of the run-time topology of
processes. We use the result of the analysis to establish a secrecy property.



1 Introduction 

Mobile Ambients (MA [6]) has recently emerged as a core programming language 
for the Web and, at the same time, as a model for reasoning about properties of 
mobile processes. Unlike other process algebras based on name communication,
such as the $\pi$-calculus [11], MA is based on the notion of ambient. An
ambient is a bounded place, where multi-threaded computation happens; in a sense,
it generalizes both the idea of agent and the idea of location. Each ambient has 
a name, a collection of local processes and a collection of subambients. Ambients 
are organized in a hierarchy that can be dynamically modified, according to three 
basic capabilities, associated with ambient names and used for access control. 
They are the following: $\mathit{in}\ n$ allows an ambient to enter the ambient
named n: $m[\mathit{in}\ n.P_1 \mid P_2] \mid n[Q] \to n[m[P_1 \mid P_2] \mid Q]$;
$\mathit{out}\ n$ allows an ambient to exit from ambient n:
$n[m[\mathit{out}\ n.P_1 \mid P_2] \mid Q] \to m[P_1 \mid P_2] \mid n[Q]$;
$\mathit{open}\ n$ allows the boundary of ambient n to be destroyed:
$\mathit{open}\ n.P \mid n[Q] \to P \mid Q$. As
shown by the rules, an ambient m moves as a whole with all its subambients. The 
movements of m only depend on the capabilities exercised by the local processes 
contained at its top level (for instance $\mathit{in}\ n.P_1$ in the first rule); instead the local
processes of subambients of m cannot directly influence the movements of their 
parent m. Also, the ambient n, that is affected by the rules, has no control on 
whether or not the action takes place. 

Safe Ambients (SA [10]) is a modification of MA, where a movement or 
an ambient dissolution can take place only when the affected ambient agrees, 
offering the corresponding coaction: $\overline{\mathit{in}}\ n$, $\overline{\mathit{out}}\ n$, $\overline{\mathit{open}}\ n$. This variation does not
change the expressiveness of MA, yet makes it easier both to write programs and 
to formally prove their correctness, especially by using behavioural equivalences. 

Several techniques, both dynamic and static, have been devised to study and 
establish various security properties of mobile calculi, based on notions of classi- 
fications and information flow. Encouraging results have been obtained by the use 
of static approaches, such as Type Systems [10,4,3,5] and Control Flow Analysis
(CFA) [12,13,14]. These techniques predict safe and computable approximations 
to the set of values or behaviours arising dynamically. For example, in calculi for 
concurrency, the approximations concern the values that variables may assume 
at run-time or the values that may flow on channels. In the ambient calculi, 
particularly relevant for security issues is information about the dynamic evolu- 
tion of the ambient hierarchy and about which and where capabilities may be 
exercised. 

The CFA given in [12] computes an approximation of the (run-time) topo- 
logical structure of MA processes and follows the lines of previous work on the 
$\pi$-calculus [1]. In more detail, the analysis predicts, for each ambient n, which
ambients and capabilities (say a) may be contained at top level inside n. The 
analysis has been applied to prove a security property of the firewall protocol 
from [6,7]. 

In this paper, we refine the analysis of [12], adapted to SA, by introducing a 
sort of contextual information, extending the proposal made in [9] . The analysis 
predicts for each ambient n: 

1. which ambients surround n whenever $\alpha$ (a capability or an ambient) is
contained inside n at top level;

2. which ambients surround n whenever $\alpha$ (a capability or an ambient),
besides being contained inside n (at top level), is also ready to interact.

Such contextual information considerably restricts the space of possible
movements. In this way we gain in precision, and thus we may consider more
properties and statically prove more programs safe.

As a simple example, we apply the result of the analysis for proving a secrecy 
property. We classify ambients into trustworthy and untrustworthy. Secrecy of 
data is preserved if an untrustworthy ambient can never open a trustworthy one. 
We show that if a program passes a simple static test, then its secret information 
is dynamically protected. 

In proving a security property it is essential to be able to guarantee that 
properties are preserved when programs work in unknown, possibly hostile con- 
texts. In general, not all contexts preserve our property, but we can say which 
of them do. As expected, the needed restrictions concern the usage of names. 
Actually, some of them cannot occur as ambient names in the context, but can 
occur inside its capabilities (cf. the assumption made in [12]). We define a tester 
process E that represents the most hostile context matching these requirements. 
If a process P in parallel with E passes the static test, then the secrecy property 
holds dynamically in any context represented by E. 

Due to lack of space, we omit all the proofs. 

2 Mobile Safe Ambients 

In this section we briefly recall the Mobile Safe Ambients calculus [10], without
communication primitives. The only difference with the syntax of MA is given
by the coactions $\overline{\mathit{in}}\ n$, $\overline{\mathit{out}}\ n$, $\overline{\mathit{open}}\ n$.
Because of a peculiar treatment of bound names (see below), we partition
names as follows. Given $\hat{\mathcal{N}} = \{n, h, k, \ldots\}$, the infinite set of
names, let $\mathcal{N}' = \bigcup_{n \in \hat{\mathcal{N}}} \mathcal{N}_n$, where
$\mathcal{N}_n = \{n_0, n_1, \ldots\}$. The name n stands for a generic element of
$\mathcal{N}_n$.

Definition 2.1 (syntax). Processes (denoted by $P, Q, R, \ldots \in \mathcal{P}$) and
capabilities (denoted by $M, N, \ldots \in \mathcal{C}$) are built according to the
following syntax

P ::=            processes               M ::=                            capabilities

  0              nil                       $\mathit{in}\ n$                  enter n

  M.P            prefix                    $\overline{\mathit{in}}\ n$       allow enter n

  n[P]           ambient                   $\mathit{out}\ n$                 exit n

  P | P          parallel composition      $\overline{\mathit{out}}\ n$      allow exit n

  $(\nu n)P$     restriction               $\mathit{open}\ n$                open n

  !P             replication               $\overline{\mathit{open}}\ n$     allow open n

In the following, capabilities are ranged over by $M, N, \ldots$, actions by $\mu$
and coactions by $\overline{\mu}$. Furthermore, capabilities and names are ranged
over by $\alpha$. Standard syntactical conventions are used: the trailing 0 in
processes M.0 is omitted, and parallel composition has the least syntactic
precedence.

We refer to the usual notions of names, free names, and bound names of
a process P, denoted by n(P), fn(P), bn(P), respectively. Hereafter, it will be
convenient to assume an external ambient, $\top \notin \mathcal{N}'$, so we let
$\mathcal{N} = \mathcal{N}' \cup \{\top\}$.

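The structural congruence $\equiv$ on processes is defined as the least
congruence satisfying the following clauses:

- $(\nu n_i)P \equiv (\nu n_j)P\{n_j/n_i\}$, if $n_j \notin fn(P)$;

- $(\mathcal{P}/_{\equiv}, \mid, 0)$ is a commutative monoid;

- $M.P \equiv M.Q$ if $P \equiv Q$;

- $n[P] \equiv n[Q]$ if $P \equiv Q$;

- $(\nu n)P \equiv (\nu n)Q$ if $P \equiv Q$;

- $(\nu n)0 \equiv 0$, $(\nu n)(\nu n')P \equiv (\nu n')(\nu n)P$,
  $(\nu n)P \equiv P$ if $n \notin fn(P)$,
  $(\nu n)(P \mid Q) \equiv (\nu n)P \mid Q$ if $n \notin fn(Q)$,
  $m[(\nu n)P] \equiv (\nu n)m[P]$ if $n \neq m$;

- $!P \equiv P \mid\, !P$.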

The structural congruence is the standard one, apart from the treatment
of $\alpha$-conversion: a name $n_i \in \mathcal{N}_n$ can only be replaced by a name
$n_j \in \mathcal{N}_n$ (not occurring free in the process). Moreover, we assume that
the names occurring bound inside restrictions are all distinct from each other
and from free names. These assumptions are only used to keep the definition of
our analysis more compact, and can be easily removed (see, e.g., [1]).

The reduction relation $\to$ of SA in Table 1 is defined in the usual way. The
only differences with the reduction semantics of MA are the basic movement
rules. We write $\Rightarrow$ for the reflexive and transitive closure of $\to$.

We introduce some notions which are necessary to formalize the soundness 
of the analysis and also the security property. 
Definition 2.2. Let P be a process and $\alpha \in \mathcal{N} \cup \mathcal{C}$ a name
or a capability. We say that $\alpha$ is

1. top level in P iff $\alpha$ does not occur inside any ambient of P;

2. enabled in P iff $\alpha$ is top level in P and it does not occur underneath a prefix.

For instance, in
$P = \mathit{in}\ k.\overline{\mathit{out}}\ n \mid m[\mathit{out}\ n.\overline{\mathit{open}}\ m.\mathit{in}\ a]$,
both capabilities $\mathit{in}\ k$ and $\overline{\mathit{out}}\ n$ and ambient m are
top level. Capability $\mathit{in}\ k$ and ambient m are also enabled, while
$\overline{\mathit{out}}\ n$ is only top level. Intuitively, if $\alpha$ is enabled,
then it can be exercised (if it is a capability) and can interact with other
processes at the same level (if it is an ambient). Hence, whenever P is placed
inside an ambient with name n, capability $\mathit{in}\ k$ may be exercised and
cause a movement of n, and ambient m may interact inside n. By contrast,
capability $\overline{\mathit{out}}\ n$ is not ready to be exercised, because it is
guarded by $\mathit{in}\ k$: it is top level but not enabled.

We say that a context C[-] (a process with a hole) is flat iff the hole is
enabled in C[-]. Also, we call Q a configuration of P if P = C[Q], and we write
$\top$[P] when C is empty.
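The two notions of Definition 2.2 can be made concrete with a small sketch. It is only an illustration under simplifying assumptions: the class names (`Nil`, `Prefix`, `Amb`, `Par`), the helper `seq`, and the string encoding of capabilities (with `co-` marking a coaction) are hypothetical, and restriction and replication are left out.

```python
from dataclasses import dataclass

# Hypothetical mini-AST for processes (restriction and replication omitted).
@dataclass(frozen=True)
class Nil:
    pass

@dataclass(frozen=True)
class Prefix:
    cap: str       # a capability such as "in k"; "co-out n" encodes a coaction
    cont: object   # the continuation process

@dataclass(frozen=True)
class Amb:
    name: str
    body: object

@dataclass(frozen=True)
class Par:
    left: object
    right: object

def top_level(p):
    """Capabilities and ambient names that do not occur inside any ambient of p."""
    if isinstance(p, Prefix):
        return {p.cap} | top_level(p.cont)
    if isinstance(p, Amb):
        return {p.name}            # the ambient itself; its body is not inspected
    if isinstance(p, Par):
        return top_level(p.left) | top_level(p.right)
    return set()

def enabled(p):
    """Top-level items that are not underneath a prefix."""
    if isinstance(p, Prefix):
        return {p.cap}             # the continuation is guarded, so it is skipped
    if isinstance(p, Amb):
        return {p.name}
    if isinstance(p, Par):
        return enabled(p.left) | enabled(p.right)
    return set()

def seq(*caps):
    """Build cap1.cap2. ... .0 from a list of capabilities."""
    proc = Nil()
    for cap in reversed(caps):
        proc = Prefix(cap, proc)
    return proc

# The process discussed after Definition 2.2.
P = Par(seq("in k", "co-out n"), Amb("m", seq("out n", "co-open m", "in a")))
```

On this encoding, `top_level(P)` collects `in k`, `co-out n` and `m`, while `enabled(P)` drops the guarded `co-out n`, matching the discussion above.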



3 Control Flow Analysis for SA 

Our analysis predicts for each ambient n which capabilities and ambients may
be contained inside n at run-time, when it occurs within a selected ambient.
The solution to the analysis of a process P is a function $\phi$, such that a
typical element of $\phi(n)$ is $\alpha^{(\ell_1,\ell_2)}$, where $\alpha$ is either a
name or a capability and $\ell_1, \ell_2$ are sets of names. Intuitively, when
$\alpha^{(\ell_1,\ell_2)} \in \phi(n)$, the analysis says that the derivatives of P
may include the following configurations:

1. for each $k \in \ell_1$, $k[(\nu\tilde{p})(R \mid n[Q])]$ where $\alpha$ is top level in Q;

2. for each $k \in \ell_2$, $k[(\nu\tilde{p})(R \mid n[Q])]$ where $\alpha$ is enabled in Q.

Before introducing the formal definitions we illustrate the above through a 
simple example. We omit brackets in singletons. 

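Example 3.1. Consider the process

$P = n[\,\mathit{in}\ k.\overline{\mathit{out}}\ n \mid m[\,\mathit{out}\ n.\overline{\mathit{open}}\ m.\mathit{in}\ a\,]\,] \mid k[\,\overline{\mathit{in}}\ k.\mathit{open}\ m\,]$.

The process P evolves (only) as follows. The ambient n can go inside k

$P \to k[\,\mathit{open}\ m \mid n[\,\overline{\mathit{out}}\ n \mid m[\,\mathit{out}\ n.\overline{\mathit{open}}\ m.\mathit{in}\ a\,]\,]\,] = Q$

When n is inside k, the ambient m can exit from n

$Q \to k[\,\mathit{open}\ m \mid n[0] \mid m[\,\overline{\mathit{open}}\ m.\mathit{in}\ a\,]\,] = R$

When m is inside k, it can be opened and it liberates the capability
$\mathit{in}\ a$ inside k

$R \to k[\,n[0] \mid \mathit{in}\ a\,] = S$.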
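The three reductions of Example 3.1 can be replayed with a small simulator. This is a sketch under assumptions, not the semantics of Table 1: every name in it (`Amb`, `has`, `fire`, `step`, `run`, the `co-` prefix for coactions) is hypothetical, each thread is a purely sequential list of capabilities, and restriction and replication are not modelled.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)                 # identity-based equality, so list.remove is safe
class Amb:
    name: str
    threads: list                    # each thread: list of (kind, name) capabilities
    kids: list = field(default_factory=list)

def has(amb, cap):
    """True if `cap` is enabled in `amb`, i.e. it heads one of its threads."""
    return any(bool(t) and t[0] == cap for t in amb.threads)

def fire(amb, cap):
    """Consume one enabled occurrence of `cap` in `amb`."""
    for t in amb.threads:
        if t and t[0] == cap:
            t.pop(0)
            return

def step(p):
    """Perform one reduction at or below ambient p; return a description or None."""
    for x in p.kids:                 # in: x enters its sibling y, which must agree
        for y in p.kids:
            if x is not y and has(x, ("in", y.name)) and has(y, ("co-in", y.name)):
                fire(x, ("in", y.name)); fire(y, ("co-in", y.name))
                p.kids.remove(x); y.kids.append(x)
                return f"{x.name} enters {y.name}"
    for y in p.kids:                 # out: x leaves its parent y, which must agree
        for x in y.kids:
            if has(x, ("out", y.name)) and has(y, ("co-out", y.name)):
                fire(x, ("out", y.name)); fire(y, ("co-out", y.name))
                y.kids.remove(x); p.kids.append(x)
                return f"{x.name} exits {y.name}"
    for y in p.kids:                 # open: p dissolves its child y, which must agree
        if has(p, ("open", y.name)) and has(y, ("co-open", y.name)):
            fire(p, ("open", y.name)); fire(y, ("co-open", y.name))
            p.kids.remove(y); p.threads += y.threads; p.kids += y.kids
            return f"{p.name} opens {y.name}"
    for child in p.kids:             # otherwise look for a redex deeper in the tree
        move = step(child)
        if move is not None:
            return move
    return None

def run(top):
    trace = []
    while True:
        move = step(top)
        if move is None:
            return trace
        trace.append(move)

# Example 3.1, with "T" standing for the external ambient.
m = Amb("m", [[("out", "n"), ("co-open", "m"), ("in", "a")]])
n = Amb("n", [[("in", "k"), ("co-out", "n")]], [m])
k = Amb("k", [[("co-in", "k"), ("open", "m")]])
top = Amb("T", [], [n, k])
trace = run(top)
```

Running it yields the trace "n enters k", "m exits n", "k opens m", after which k contains the empty ambient n and the liberated capability `in a`, as in process S.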

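A solution for the analysis of P is as follows

[Table: the solution $\phi$ for P, giving the labelled contents of $\phi(\top)$,
$\phi(n)$, $\phi(m)$ and $\phi(k)$; the labels of the individual entries are not
legible in the source.]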

- $\phi(\top)$ contains ambients n and k with label $(\emptyset, \emptyset)$, to
  indicate that they occur at the outermost level.

- $\phi(n)$ contains: m, with a label predicting that m is enabled when n is
  inside $\top$ or k (see processes P and Q); $\mathit{in}\ k$, with a label showing
  that $\mathit{in}\ k$ is enabled when n is inside $\top$ (see P) and will be
  consumed when n moves elsewhere (see Q and R); $\overline{\mathit{out}}\ n$, with a
  label recording that $\overline{\mathit{out}}\ n$ is top level when n is inside
  $\top$ or k (see P and Q), while it is only enabled when $\mathit{in}\ k$ has been
  consumed, and thus n has moved inside k (see Q).

- $\phi(m)$ contains: $\mathit{out}\ n$, with a label predicting that
  $\mathit{out}\ n$ is enabled when m is inside n (see P and Q);
  $\overline{\mathit{open}}\ m$, with a label showing that
  $\overline{\mathit{open}}\ m$ is top level when m is inside n or k (see P and Q),
  while it is only enabled after the execution of $\mathit{out}\ n$, that is in k
  (see R). Similarly for $\mathit{in}\ a$.

- $\phi(k)$ contains: n and m with label $(\top, \top)$ to indicate that both
  ambients may enter k (n by exercising $\mathit{in}\ k$ and m by exercising
  $\mathit{out}\ n$); $\overline{\mathit{in}}\ k$ and $\mathit{open}\ m$ with label
  $(\top, \top)$, showing that m does not move from $\top$. Moreover, it also
  contains $\mathit{in}\ a$, with a label predicting that $\mathit{in}\ a$ is top
  level and enabled when k is inside $\top$ (see S).

Although $\ell_2 \subseteq \ell_1$ whenever $\alpha^{(\ell_1,\ell_2)} \in \phi(n)$ (an
enabled $\alpha$ is also top level), both sets are useful for an accurate
prediction of process behaviour:

1. $\ell_1$ approximates the set of ambients that surround n when $\alpha$ is top
level and that therefore may acquire $\alpha$ by opening n. For instance, in
Example 3.1 the ambient k acquires capability $\mathit{in}\ a$ from $\phi(m)$
through $\mathit{open}\ m$, because k belongs to (the first component of) its
label. Capabilities $\mathit{out}\ n$ and $\overline{\mathit{open}}\ m$ are not
acquired, because they are not top level when m is inside k.

2. $\ell_2$ approximates the set of ambients that surround n when $\alpha$ is
enabled, namely when $\alpha$ is ready to interact. Hence, $\ell_2$ can be used to
say whether a capability is executable and to predict its effect. For instance,
in Example 3.1 the (second component of the) labels of
$\overline{\mathit{out}}\ n$ in $\phi(n)$ and of $\mathit{out}\ n$ in $\phi(m)$ are
used to predict the effect of $\mathit{out}\ n$ on the ambient m. Since
$\mathit{out}\ n$ is enabled when m is inside n and $\overline{\mathit{out}}\ n$ is
enabled when n is inside k, m may go out of n and may end up inside k, the
current parent ambient of n.

NOTATION. In the following we shall make use of some notation and
abbreviations. The set of labels is
$\mathcal{L} = \{(\ell_1, \ell_2) \mid \ell_1, \ell_2 \in \wp(\mathcal{N}) \wedge \ell_2 \subseteq \ell_1\}$.
Hereafter, L will stand for $(\ell_1, \ell_2)$ and $L_i$ will stand for
$(\ell_{1,i}, \ell_{2,i})$. Inclusion and union of labels are defined componentwise.
Moreover, with an abuse of notation, for $N \in \wp(\mathcal{N})$ we write
$L \subseteq N$ in place of $L \subseteq (N, N)$ and $L \cup N$ in place of
$L \cup (N, N)$.
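The componentwise operations on labels can be sketched as follows. The function names (`label`, `leq`, `union`, `lift`) and the encoding of a label as a pair of frozensets are assumptions made for illustration only; `"T"` stands for the external ambient.

```python
def label(l1, l2):
    """Build a label (l1, l2), enforcing the side condition l2 ⊆ l1."""
    l1, l2 = frozenset(l1), frozenset(l2)
    if not l2 <= l1:
        raise ValueError("l2 must be a subset of l1")
    return (l1, l2)

def leq(a, b):
    """Componentwise inclusion of labels."""
    return a[0] <= b[0] and a[1] <= b[1]

def union(a, b):
    """Componentwise union of labels; it preserves l2 ⊆ l1."""
    return (a[0] | b[0], a[1] | b[1])

def lift(names):
    """The abuse of notation: a set of names N is read as the label (N, N)."""
    s = frozenset(names)
    return (s, s)

L = label({"T", "k"}, {"k"})
```

For instance, `leq(L, lift({"T", "k"}))` holds, which is the reading of $L \subseteq N$ for N = {T, k}, and `union(L, lift({"n"}))` adds n to both components.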

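The set of located capabilities and names is denoted $\mathcal{LCA}$, ranged over
by $\alpha^L$, where $\alpha$ is either a capability or an ambient name and L is a
label. An element I is a subset of $\mathcal{LCA}$. Over elements we define

1. $\alpha^{L_1} \in I$ if $\alpha^{L_2} \in I$ and $L_1 \subseteq L_2$;

2. $I_1 \sqsubseteq I_2$ iff $\alpha^{L} \in I_1$ implies $\alpha^{L} \in I_2$.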

For example {inn*-^A)| ly 

THE ANALYSIS. We now formalize the intuition given above. Our CFA does
not distinguish among different elements of the same equivalence class
$\mathcal{N}_n$. This amounts to saying that $\phi(n)$ stands for the analysis of
every $n_i$, and, similarly, each n occurring in a located capability stands for
every $n_j$ as well. In this way, we statically maintain the identity of names
that may be lost by applying the standard $\alpha$-conversion: for a different
approach see the naming environment and the stable names in [12].
Definition 3.2 (Solutions). Given a process P, let
$n(P) \cup \{\top\} \subseteq \mathcal{N}_P$. A solution for P is a total function
$\phi : \mathcal{N}_P \to \wp(\mathcal{LCA})$.

Furthermore, we order solutions by letting $\phi_1 \sqsubseteq \phi_2$ iff
$\forall n \in \mathcal{N}_P.\ \phi_1(n) \sqsubseteq \phi_2(n)$.

A solution $\phi$ for a process Q is validated if and only if the judgment for Q
at the outermost level holds (shortly $\phi \models Q$) according to the set of
clauses in Table 2. Clauses operate on judgments of the form

$\phi \models^{k,L}_{I} P$

where k is a name of an ambient or $\top$, L is a label, and I is an element. The
rules in Table 2 use some auxiliary notions that follow.

AUXILIARY NOTIONS. Let $\phi$ be a solution for P and $n, k \in \mathcal{N}_P$. Some
of the functions defined below depend on $\phi$, which we shall often omit when
clear from the context.

We introduce a function f that collects for each ambient n a superset of its
possible fathers or parent ambients, i.e. the ambients that may surround n. In
the same definition, we introduce the function ENAB which, given an ambient
n and a capability M, collects those fathers of n in which M may be enabled.
Moreover, we define a sort of transitive closure w.r.t. the open operation.



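Definition 3.3 (Fathers, ENAB, $k{\uparrow}$).

- $f^{\phi}(n) = \{k \mid n^{L} \in \phi(k)\}$;

- $\mathrm{ENAB}^{\phi}(M, n) = \bigcup \{\ell_2 \mid M^{(\ell_1,\ell_2)} \in \phi(n)\}$;

- $k{\uparrow}$ is the least set of names such that $k \in k{\uparrow}$ and,
  $\forall h \in k{\uparrow}$, if
  $n \in \mathrm{ENAB}^{\phi}(\overline{\mathit{open}}\ h, h)$ and
  $\mathit{open}\ h^{L} \in \phi(n)$ then $n \in k{\uparrow}$.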

For instance in Example 3.1 $\mathrm{ENAB}(\overline{\mathit{out}}\ n, n) = k$ and
$\mathrm{ENAB}(\mathit{in}\ k, n) = \top$, which show that
$\overline{\mathit{out}}\ n$ may be exercised when n is inside k only, and that
$\mathit{in}\ k$ may be exercised when n is inside $\top$ only.

Based on the above, the function t predicts the target ambients in which an 
ambient ends up after a movement action (in or out ). 



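Definition 3.4 (Target). The target of $\alpha$ in k,
$t^{\phi}(\alpha, k) \subseteq \mathcal{N}_P$, is given by

1. $t^{\phi}(\mathit{in}\ n, k) = n$ if
   $\left(\bigcup_{h \in k{\uparrow}} \mathrm{ENAB}^{\phi}(\mathit{in}\ n, h)\right) \cap \mathrm{ENAB}^{\phi}(\overline{\mathit{in}}\ n, n) \neq \emptyset$,
   and $\emptyset$ otherwise;

2. $t^{\phi}(\mathit{out}\ n, k) = \mathrm{ENAB}^{\phi}(\overline{\mathit{out}}\ n, n)$ if
   $n \in \bigcup_{h \in k{\uparrow}} \mathrm{ENAB}^{\phi}(\mathit{out}\ n, h)$,
   and $\emptyset$ otherwise;

3. $t^{\phi}(\alpha, k) = \emptyset$ if
   $\alpha \in \{\mathit{open}\ n, \overline{\mathit{in}}\ n, \overline{\mathit{out}}\ n, \overline{\mathit{open}}\ n\}$.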



The target of a movement capability is empty whenever the capability is
deadlocked. In detail: $\mathit{in}\ n$ may be executed only if there exists an
ambient h such that $\mathit{in}\ n$ is enabled when k is inside h and
$\overline{\mathit{in}}\ n$ is enabled when n is also inside h;
$\mathit{out}\ n$ is executable only if it is enabled when k is inside n. When
a capability $\mathit{in}\ n$ is executable, its effect is that of moving the
ambient k inside n; when a capability $\mathit{out}\ n$ is executable, its effect
is that of moving the ambient k inside some of the fathers of n, the ones where
$\overline{\mathit{out}}\ n$ is enabled.

For instance in Example 3.1, $t(\mathit{in}\ k, n) = k$, since $\mathit{in}\ k$ is
enabled when n is inside $\top$, while $\overline{\mathit{in}}\ k$ is enabled when
k is inside $\top$. Moreover, $t(\mathit{out}\ n, m) = k$ since $\mathit{out}\ n$
is enabled when m is inside n and $\overline{\mathit{out}}\ n$ is enabled when n
is inside k.
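The target function can be sketched in the same style. This is a simplified illustration, not the full Definition 3.4: the closure under open is ignored (only the ambient k itself is considered), capabilities are encoded as strings with `co-` marking coactions, `"T"` stands for the external ambient, and the fragment of the solution is an assumption based on the informal description of Example 3.1.

```python
def entry(alpha, l1, l2):
    """A located capability alpha^(l1, l2), stored as a hashable triple."""
    return (alpha, frozenset(l1), frozenset(l2))

# Assumed fragment of a solution for Example 3.1.
phi = {
    "n": {entry("in k", ["T"], ["T"]), entry("co-out n", ["T", "k"], ["k"])},
    "k": {entry("co-in k", ["T"], ["T"])},
    "m": {entry("out n", ["n"], ["n"])},
}

def enab(phi, cap, n):
    """ENAB(cap, n): union of the second label components of cap in phi(n)."""
    result = set()
    for a, _, l2 in phi.get(n, ()):
        if a == cap:
            result |= l2
    return result

def target(phi, cap, k):
    """Simplified t(cap, k): where ambient k may end up by exercising cap."""
    kind, n = cap.split(" ", 1)
    if kind == "in":
        # k and n need a common father where both the action and the coaction are enabled
        common = enab(phi, cap, k) & enab(phi, "co-in " + n, n)
        return {n} if common else set()
    if kind == "out":
        # k must be inside n with 'out n' enabled; it lands where 'co-out n' is enabled
        return enab(phi, "co-out " + n, n) if n in enab(phi, cap, k) else set()
    return set()        # open and all coactions do not move k
```

On this data `target(phi, "in k", "n")` and `target(phi, "out n", "m")` both evaluate to {k}, as in the example above, while a capability with no matching coaction gets an empty target.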

We also introduce some operators acting on elements, i.e. on subsets of
$\mathcal{LCA}$. They are used in the clauses (open) and (par): we shall give some
intuition later on.



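Definition 3.5. We define

- $I \triangleleft k = \{\alpha^{L} \mid \alpha^{L} \in I \wedge k \in \ell_1\}$;

- $\gamma^{\phi}(k, I, n) = I \triangleleft k$ if
  $k \in \mathrm{ENAB}^{\phi}(\overline{\mathit{open}}\ n, n)$, and $\emptyset$ otherwise;

- $I \oplus L = \{\alpha^{L' \cup L} \mid \alpha^{L'} \in I\}$;

- $I_1 \otimes_n I_2 = \left(I_1 \oplus \bigcup\{t^{\phi}(M, n) \mid M^{L} \in I_2\}\right) \cup \left(I_2 \oplus \bigcup\{t^{\phi}(M, n) \mid M^{L} \in I_1\}\right)$.

By abuse of notation, we write $I \oplus \ell$ for $I \oplus (\ell, \ell)$.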



Intuition on the CFA clauses. The intuitive meaning of the elements of a
judgment $\phi \models^{k,L}_{I} P$, with $L = (\ell_1, \ell_2)$, is the following: the
process P to be examined is (statically) contained in the current ambient k; the
element I, a subset of $\phi(k)$, contains enough information to validate process
P assuming the following configurations are reachable:

1. for any $m \in \ell_1$, $m[(\nu\tilde{p})(R \mid k[Q])]$, where P is a subprocess
of Q (possibly equal to Q itself), and

2. for any $m \in \ell_2$, $m[(\nu\tilde{p})(R \mid k[P])]$.

Using subsets of $\phi(k)$ increases the accuracy of the analysis, especially in
the case of parallel processes (see the explanation of clause (par) below). The
clauses operate on the structure of process P, updating k, L and I. The current
ambient k is updated whenever we pass inside a new ambient. The current label L
is updated in the same situation and whenever we pass a movement capability
(in and out). Each time an ambient or a capability is under consideration,
the analysis checks that the current element I contains the corresponding located
component. The check skips to another element I' to analyse the continuation of
the process in almost every clause, except for (pref) and ($\overline{\text{pref}}$).
The new element I' or a suitable modification of it must be part of the old
subset I (and finally part of the analysis of the current ambient). We now
illustrate the more relevant clauses.

(amb) Ambient n is (statically) contained inside the ambient k. Label L says
that n may be enabled when k is inside an ambient of $\ell_2$, while it may be top
level but not necessarily enabled when k is inside an ambient of $\ell_1$.

Hence, the located ambient $n^{L}$ has to occur in I, i.e. in the solution for k.
The process P has to be validated in the ambient n with label (k, k), because
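nil:  $\phi \models^{k,L}_{I} 0$  iff  true

amb:  $\phi \models^{k,L}_{I} n[P]$  iff  $n^{L} \in I \wedge \exists I' : \phi \models^{n,(k,k)}_{I'} P \wedge I' \sqsubseteq \phi(n)$

pref:  $\phi \models^{k,L}_{I} \mu.P$ with $\mu \neq \mathit{open}\ n$  iff
       $\mu^{L} \in I \wedge \phi \models^{k,(\ell_1 \cup t(\mu,k),\, t(\mu,k))}_{I} P \wedge (\forall h \in t(\mu, k) : k^{L'} \in \phi(h))$

$\overline{\text{pref}}$:  $\phi \models^{k,L}_{I} \overline{\mu}.P$  iff  $\overline{\mu}^{L} \in I \wedge \phi \models^{k,L}_{I} P$

open:  $\phi \models^{k,L}_{I} \mathit{open}\ n.P$  iff
       $(\mathit{open}\ n^{L} \in I) \wedge \exists I' : \phi \models^{k,L}_{I'} P \wedge I' \otimes_k (\gamma(k, \phi(n), n) \oplus L) \sqsubseteq I \wedge$
       $\forall k \in \mathrm{ENAB}(\overline{\mathit{open}}\ n, n),\ h \in \mathcal{N},\ \alpha^{(\ell_1,\ell_2)} \in \phi(h) : n \in \ell_i \Rightarrow k \in \ell_i,\ \forall i = 1, 2$

par:  $\phi \models^{k,L}_{I} P \mid Q$  iff  $\exists I_1, I_2 : \phi \models^{k,L}_{I_1} P \wedge \phi \models^{k,L}_{I_2} Q \wedge I_1 \otimes_k I_2 \sqsubseteq I$

repl:  $\phi \models^{k,L}_{I}\ !P$  iff  $\exists I' : \phi \models^{k,L}_{I'} P \wedge I' \otimes_k I' \sqsubseteq I$

Table 2. Control Flow Analysis for SA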
the configuration $k[(\nu\tilde{p})(R \mid n[P])]$ may be reachable. To analyse P it
then suffices to find an element I' such that $I' \sqsubseteq \phi(n)$.

(pref) As in the previous rule, the located capability $\mu^{L}$ must belong to the
element I. The residual process P has to be validated still inside the ambient
k, considering that ambient k, as an effect of $\mu$, may have moved. Indeed,
function $t(\mu, k)$ gives the set of ambients in which k may enter because of
the execution of $\mu$. Hence, the following configurations may be reachable:
$m[(\nu\tilde{p})(R \mid k[P])]$, for $m \in t(\mu, k)$, and
$m[(\nu\tilde{p})(R \mid k[Q])]$, where P is a subprocess of Q, for
$m \in \ell_1 \cup t(\mu, k)$. Label $(\ell_1 \cup t(\mu, k), t(\mu, k))$ is thus
used to validate P.

For instance in Example 3.1 we have
$\overline{\mathit{out}}\ n^{(\{\top,k\},k)} \in \phi(n)$, because
$t(\mathit{in}\ k, n) = k$ and

$\phi \models^{n,(\top,\top)}_{I} \mathit{in}\ k.\overline{\mathit{out}}\ n$ requires $\phi \models^{n,(\{\top,k\},k)}_{I} \overline{\mathit{out}}\ n$

while

$\phi \models^{n,(\{\top,k\},k)}_{I} \overline{\mathit{out}}\ n$ requires $\overline{\mathit{out}}\ n^{(\{\top,k\},k)} \in I$.

The remaining condition guarantees that the ambients in $t(\mu, k)$ have been
recorded as possible fathers of k.

(par) The main idea is that of validating separately the process P and the
process Q and to combine the results through the operator $\otimes$. Consider the
process $R = \mathit{in}\ n \mid \mathit{out}\ m$. We obtain for instance

$\phi \models^{k,L}_{I_1} \mathit{in}\ n$ and $\phi \models^{k,L}_{I_2} \mathit{out}\ m$

for $I_1 = \{\mathit{in}\ n^{L}\}$, $I_2 = \{\mathit{out}\ m^{L}\}$. Neither $I_1$ nor
$I_2$ contains enough information to validate R, because capabilities
$\mathit{in}\ n$ and $\mathit{out}\ m$ may be executed in any order. Hence, the
solution is sound only if it predicts that $\mathit{out}\ m$ may be exercised in
$\ell_2$ and also after $\mathit{in}\ n$, i.e. when k is inside n. Symmetrically, it
predicts that $\mathit{in}\ n$ may be exercised in $\ell_2$ and also after
$\mathit{out}\ m$, namely when k has moved out of m. Since label L does not
contain this information, the labels of capabilities and ambients of the
elements $I_1$ and $I_2$ have to be updated according to the movements of the
parallel processes. The operation $\otimes$ derives from elements $I_1$ and $I_2$ a
correct element for R. Assuming that $t(\mathit{in}\ n, k) = n$ and
$t(\mathit{out}\ m, k) = f(m)$, we have
$I_1 \otimes_k I_2 = \{\mathit{in}\ n^{L \cup f(m)}, \mathit{out}\ m^{L \cup n}\}$.

(open) The effect of an $\mathit{open}\ n$ exercised inside k is that of
liberating the process (say Q) which is currently contained in n, and placing it
in parallel with the continuation of $\mathit{open}\ n$ (say P). Considering a
simplified case where P = 0, we have that
$k[\,\mathit{open}\ n \mid n[\,\overline{\mathit{open}}\ n.Q\,]\,]$ reduces to k[Q].

The idea is that the element I is valid iff it also validates Q, i.e. if I
contains the ambients and capabilities which may be liberated inside k by
opening n. If $\alpha^{(\ell_1,\ell_2)} \in \phi(n)$ and $k \in \ell_1$, then the
following configurations may be reachable: $k[(\nu\tilde{p})(R \mid n[S])]$, where
$\alpha$ is top level in S and may be acquired by k by opening n. Hence, the
subset $\phi(n) \triangleleft k$ of $\phi(n)$ (that is, the set of $\alpha^{L}$ such
that $k \in \ell_1$) gives an approximation of the ambients and capabilities that
k may acquire.
To be more precise, we require that $\phi(n) \triangleleft k$ is a subset of the
element I only when $\mathit{open}\ n$ is executable, i.e. when
$\overline{\mathit{open}}\ n$ is enabled when n is inside k (see Definition 3.5
of $\gamma$).

For instance in Example 3.1, $\phi(m)$ contains $\mathit{out}\ n$,
$\overline{\mathit{open}}\ m$ and $\mathit{in}\ a$, each with its label. Therefore,
$\gamma(k, \phi(m), m)$ contains only the located capability $\mathit{in}\ a$: only
capability $\mathit{in}\ a$ is acquired by k through $\mathit{open}\ m$.

If $\mathit{open}\ n$ has a continuation P, then
$k[\,\mathit{open}\ n.P \mid n[\,\overline{\mathit{open}}\ n.Q\,]\,]$ reduces to
$k[P \mid Q]$. Hence, the element $\gamma(k, \phi(n), n)$ has to be combined in
parallel with the element that validates P. The remaining condition in the
clause ensures that the effect of $\mathit{open}\ n$ is properly recorded in all
the labels. Since n may be dissolved inside k, every capability or ambient which
was enabled (or top level) in n is also enabled (or top level) in k.

Remark 3.6. We can now better comment on the role of labels $\ell_1$ and
$\ell_2$: label $\ell_1$ is used to handle more accurately the effect of
$\mathit{open}\ n$ (see rule (open)); label $\ell_2$ restricts the cases where
capabilities are executable and also allows one to predict more precisely where
an ambient may end up after its execution (see Definition 3.4 of t).

If capabilities and ambients were not labelled in our manner, the prediction
of the analysis would be less precise. For instance in Example 3.1, the analysis
in [12] would imprecisely predict that k may acquire, by opening m, also
$\mathit{out}\ n$ and $\overline{\mathit{open}}\ m$, i.e. the whole set $\phi(m)$.
Moreover, the same analysis would imprecisely predict that the ambient m may end
up, by executing $\mathit{out}\ n$, in any father of n, thus in k and also in
$\top$.



4 Properties of the analysis 

We state some standard results of our analysis. First, there always exists a
least solution which is valid according to the clauses of Table 2. It is enough
to show that the set of valid solutions is a complete lattice w.r.t.
$\sqsubseteq$ (here defined componentwise), and that it forms a Moore family ^ [8,15].

Theorem 4.1. The set $\mathcal{J} = \{\phi \mid \phi \models P\}$ is a Moore family.

The analysis satisfies a standard Subject Reduction theorem: validity of
solutions is preserved under reduction.

Theorem 4.2. Let $\phi \models^{k,L}_{I} P$, $I \sqsubseteq \phi(k)$ and
$L \subseteq f(k)$. If $P \to Q$ then there exists I' such that
$\phi \models^{k,L}_{I'} Q$ and $I \sqsubseteq I' \sqsubseteq \phi(k)$.

Corollary 4.3. Let $\phi \models P$. If $P \Rightarrow Q$ then $\phi \models Q$.

^ (X, C) is a Moore family whenever any of its subsets admits a g.l.b. in T 
The following theorem states the relation between the dynamic and static
behaviour. It says that a valid solution for a process P contains all the
configurations of any derivative of P. In detail, for every configuration
$k[(\nu\tilde{p})(R_2 \mid n[R_1])]$ of any derivative of P, and for every $\alpha$
(capability or name) top level (resp. enabled) in $R_1$, the solution for n
contains $\alpha^{(\ell_1,\ell_2)}$ where $k \in \ell_1$ (resp. $k \in \ell_2$).

Theorem 4.4 (Correctness). Let $\alpha \in \mathcal{N} \cup \mathcal{C}$ and
$n, k \in \mathcal{N}$. Whenever $\phi \models P$, $P \Rightarrow Q$, and
$k[(\nu\tilde{p})(n[R_1] \mid R_2)]$ is a configuration of Q:

1. if $\alpha$ is enabled in $R_1$, then $\alpha^{(\ell_1,\ell_2)} \in \phi(n)$ with $k \in \ell_2$;

2. if $\alpha$ is top level in $R_1$, then $\alpha^{(\ell_1,\ell_2)} \in \phi(n)$ with $k \in \ell_1$.

5 A Security Property 

An ambient m may acquire the information contained in another ambient by
opening it. We wish to keep some information confidential to a group of
ambients, considered trustworthy. To do so, names $\mathcal{N}$ (including $\top$)
are partitioned into trustworthy $\mathcal{T}$ and untrustworthy $\mathcal{U}$.
Secrecy of data is preserved if an untrustworthy ambient can never open a
trustworthy one. Based on this partition, we first formalise our dynamic notion
of secrecy, protection; then we give a static notion that implies the dynamic
one.

We say that n is opened by k in P iff $k[(\nu\tilde{p})(n[R_1] \mid R_2)]$ is a
configuration of P and $\overline{\mathit{open}}\ n$ and $\mathit{open}\ n$ are
enabled in $R_1$ and in $R_2$, respectively.

Definition 5.1 (Dynamic Property). A process P is protected iff,
$\forall k \in \mathcal{N}$, $n \in \mathcal{T}$, whenever $P \Rightarrow Q$ and n is
opened by k in Q, then $k \in \mathcal{T}$.

Definition 5.2 (Static Property). A process P is defended if there exists a
solution $\phi$ such that $\phi \models P$ and, $\forall k \in \mathcal{N}$,
$n \in \mathcal{T}$, whenever $\mathit{open}\ n^{L_1} \in \phi(k)$,
$\overline{\mathit{open}}\ n^{L_2} \in \phi(n)$ and $k \in \ell_{2,2}$, then
$k \in \mathcal{T}$.

Our static property is a correct approximation of the dynamic one. In fact, 
Theorem 4.4 suffices to prove the following. 

Lemma 5.3. If P is defended then it is protected. 
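The static test of Definition 5.2 amounts to a simple scan of a candidate solution. The sketch below is an illustration under assumptions: it checks one given solution (validity w.r.t. Table 2 is taken for granted and not re-checked), located items are triples `(alpha, l1, l2)`, `co-open` encodes the coaction, and the two toy solutions are hypothetical data, not the solution of Table 3.

```python
def is_defended_by(phi, trusted):
    """Check the condition of Definition 5.2 on one candidate solution phi.

    phi maps an ambient name to a set of triples (alpha, l1, l2);
    trusted is the set of trustworthy names.
    """
    for k, items in phi.items():
        for alpha, _, _ in items:
            if not alpha.startswith("open "):
                continue                      # only 'open n' occurring in phi(k) matters
            n = alpha[len("open "):]
            if n not in trusted:
                continue                      # only trustworthy targets are protected
            for beta, _, l2 in phi.get(n, ()):
                # n agrees to be opened while it is inside k, and k is untrustworthy
                if beta == "co-open " + n and k in l2 and k not in trusted:
                    return False
    return True

fs = frozenset
trusted = {"b", "msg"}

# Hypothetical solution: msg only agrees to be opened inside b.
phi_ok = {
    "b": {("open msg", fs({"T"}), fs({"T"}))},
    "msg": {("co-open msg", fs({"mail", "b"}), fs({"b"}))},
}

# Hypothetical solution: msg may also be opened inside the untrustworthy ambient a.
phi_bad = {
    "a": {("open msg", fs({"T"}), fs({"T"}))},
    "b": {("open msg", fs({"T"}), fs({"T"}))},
    "msg": {("co-open msg", fs({"mail", "a", "b"}), fs({"a", "b"}))},
}
```

The first solution passes the test and the second fails it. Note that only the second label component of the coaction is inspected, which is why the accuracy of $\ell_2$ matters for this property.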

The property above is not enough, because it does not guarantee that a
defended process P will still be such when plugged in a hostile context C[-],
unless the whole C[P] is analysed again and proved to be defended. However, we
can characterize those contexts that do not break the secrecy of P and for which
there is no need to analyse C[P]. Technically, we first define a tester process,
called enemy, which represents the most hostile context in which P can be put.
The enemy satisfies some mild conditions on the names of its ambients: they
should not clash with a subset N of those occurring in P. Note that the enemy
may know the names of P, even if trustworthy: they need not be kept secret
(e.g. by restricting them). Also, the check on the names of the enemy can be
done at run-time by P: entering, leaving or opening an ambient n of the enemy
should be forbidden if $n \in N$. This can be easily done by a dynamic check.

Safe Ambients: Control Flow Analysis and Security 211

Then, we show that if the system where P and its enemy run in parallel is 
defended, then P is protected in any context represented by the enemy. 

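Definition 5.4 (Enemy). The enemy of a process P w.r.t. $N \subseteq fn(P)$ is

$E(P, N) = \left(\,\mid_{m \in fn(P) \setminus N}\ !m[Q]\,\right) \mid Q$

where $Q = \mid_{n \in fn(P)} (\,!\mathit{in}\ n \mid\ !\mathit{out}\ n \mid\ !\mathit{open}\ n \mid\ !\overline{\mathit{in}}\ n \mid\ !\overline{\mathit{out}}\ n \mid\ !\overline{\mathit{open}}\ n\,)$.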



Theorem 5.5. If $P \mid E(P, N)$ is defended for $N \subseteq fn(P)$, then C[P] is
protected for every flat context C[-] with no configuration n[Q], $n \in N$.

Note that it is not necessary to actually analyse E(P, N): its solutions all
have the same shape and can be easily built. Furthermore, given the standard
solution for E(P, N) and a solution for P, it is possible to build a valid
solution for $P \mid E(P, N)$.

6 An Example 

Consider an ambient a willing to send a message to an ambient b. Since the ambi- 
ent calculus has no primitives for remote communication, a message is delivered 
by enclosing it in an ambient that moves inside the receiver. This acquires the 
message by opening it. When the message is secret, it is essential to guarantee 
that no ambient can open the ambient carrying the message, except for the des- 
ignated receiver b. In other words, we would like our system to be protected (but 
we do not consider whether b receives the message or not). 

Assuming that the path from a to b is known, an abstract specification could
be the following, where mail carries the message D

$SYS = a[\,\overline{\mathit{out}}\ a.\overline{\mathit{in}}\ a \mid \mathit{open}\ mail \mid MAIL\,] \mid b[\,\overline{\mathit{in}}\ b.\mathit{open}\ msg\,]$

$MAIL = mail[\,\mathit{out}\ a.\mathit{in}\ b.\overline{\mathit{out}}\ mail.\mathit{out}\ b.\mathit{in}\ a.\overline{\mathit{open}}\ mail \mid msg[\,\mathit{out}\ mail.\overline{\mathit{open}}\ msg.D\,]\,]$

The ambient mail goes out of a and carries the message, included in msg, into
b. When there, msg exits from mail and b reads D through $\mathit{open}\ msg$.
Then, mail goes back to a, willing to deliver the ack of b through
$\mathit{open}\ mail$.

When the message is secret, it is essential to guarantee that no ambient can
open msg and therefore read the message, except for the receiver b. Assume that
$\mathcal{T} = \{b, msg\}$ is the set of trustworthy ambients and that all the other
ambients (including $\top$) are untrustworthy and form the complementary set
$\mathcal{U}$. We wish to prove that SYS is protected (even when placed into a
context which knows the names msg, mail and b).

Actually, we show that $P = SYS \mid E(SYS, \{b, msg, mail\})$ is defended.
Then, Theorem 5.5 suffices to prove that C[SYS] is protected in any context
where there are no ambients with name b, msg or mail.

^ For the sake of simplicity, we assume here that D is a passive datum, that may only
occur in the same position as 0 in processes, and that requires no analysis.

The solution in Table 3 is valid for P and shows it protected. Indeed $\phi$
satisfies the requirements of Definition 5.2:

1. msg can be opened in b only, because of the label of
$\overline{\mathit{open}}\ msg$ in $\phi(msg)$;

2. ambient b is locked (it is not openable), because
$\overline{\mathit{open}}\ b$ does not occur in $\phi(b)$.



[Table body: the located capabilities and ambients in $\phi(mail)$, $\phi(b)$,
$\phi(msg)$ and $\phi(n)$; the individual entries are not legible in the source.]



Table 3. A Solution for SYS 



Now we can better explain why the restrictions on the use of names in the
context are necessary. The constraints on the use of trustworthy names b and msg
are necessary for both the dynamic and the static property to hold. Dynamically,
a context may contain a trustworthy ambient, say msg, that accepts to be opened
inside any other ambient. From a static point of view, $\phi$ is a function and
so ambients with the same name are mapped to the same element. So, there is
no way to distinguish the located capabilities and ambients of the "good" and
trustworthy ambient mail from the ones of the "bad" and trustworthy ambient
mail occurring in the context.
Note that labels are essential to prove (statically) that SYS satisfies the
security property (see the discussion on the precision of the analysis contained 
in Remark 3.6). 

7 Conclusions 

The idea of using static techniques, such as those based on CFA [12,13,14] and 
type systems [10,4,3,5], to prove properties of mobile ambients is not new. The 
analysis of [12] is similar to ours but less accurate (if adapted to SA), since it does 
not use any contextual information (see Remark 3.6). The analyses contained in 
[13,14] show how to perform a rational reconstruction of the analysis [12] in the 
framework of Abstract Interpretation as well as how to specify stronger analyses 
using powerful counting analyses. Moreover, an approach similar to ours,
even though based on type systems, is proposed in [2].

Many research lines are still open, besides the use of shape analysis [14] for
improving the precision of our analysis in the case of recursive processes. Among
them are the problem of constructing solutions efficiently and that of also
considering communication. Furthermore, the generality of the analysis (which is
not ad hoc for a specific property) suggests defining other static tests on the
solutions to establish other security properties, such as the mobility and
locking properties of [3,5], or integrity.

It is worth mentioning that, even though SA was not introduced for security
reasons, it seems to be better suited than MA to writing secure programs. The
coaction $\overline{open}\ n$, for instance, offers control over access to the
private resources of the ambient n. In the example of Section 6, a proper use of
coactions guarantees that trustworthy ambients may be contained inside
untrustworthy ambients without being opened. The protocol would have to be
substantially more complicated to achieve the same security property in MA.

Acknowledgments. We wish to thank Flemming and Hanne Riis Nielson for many 
useful discussions. This work has been partially supported by MURST Project 
TOSCA. 
Abstract. The Ambient Calculus and the Safe Ambient Calculus have
been recently successfully proposed as models for the Web. They are 
based on the notions of ambient movement and ambient opening. Dif- 
ferent type disciplines have been devised for them in order to avoid un- 
wanted behaviours of processes. 

In the present paper we propose a type discipline for safe mobile ambi- 
ents which is essentially motivated by ensuring security properties. We 
associate security levels to ambients and we require that an ambient at 
security level s can only be traversed or opened by ambients at security 
level at least s. Since the movement and opening rights can be unrelated, 
we consider two partial orders between security levels. 

We also discuss some meaningful examples of use of our type discipline. 



1 Introduction 

The Ambient Calculus [4] has been recently successfully proposed as a model 
for the Web. An ambient is a named location: it may contain processes and 
sub-ambients. A process may: 

— communicate in an asynchronous way with a process in the same ambient; 

— cause the enclosing ambient to move inside or outside other ambients; 

— destroy the boundary of a sub-ambient, causing the contents of the sub- 
ambient to spill into the parent ambient. 

In order to have a richer algebraic theory, in the Safe Ambient Calculus [9] 
the activity of processes is better controlled since: 

— an ambient may traverse an ambient n only if at least one process inside n 
agrees; 

— a process may destroy the boundary of an ambient n only if at least one 
process inside n agrees. 

A standard way of forbidding unwanted behaviours is to impose a type dis- 
cipline. 

Different type disciplines have been proposed for the Ambient Calculus, tak- 
ing advantage of several papers on typing for mobile processes, with or without 
localities, see for example [11], [6] and [10]. In [5] the types assure the correctness 
of communications. The type system of [2] guarantees also that only ambients 

* Partially supported by MURST Cofin ’99 TOSCA Project. 
which are declared as mobile will move and only ambients which are declared as
openable will be opened. Adding subtyping allows us to obtain a more flexible 
type discipline [12]. Lastly, by means of group names [3], the type of an ambient 
n controls the set of ambients n may cross and the set of ambients n may open. 
Moreover the possibility of creating fresh group names gives a flexible way of 
statically preventing unwanted propagation of names. 

A powerful type discipline for the Safe Ambient Calculus has been devised 
in [9]. The main features are the control of ambient mobility and the removing 
of all grave interferences, i.e. of all non-deterministic choices between logically 
incompatible interactions. This is achieved by means of types which can be 
derived only for single-threaded ambients, i.e. ambients which at every step offer 
at most one interaction with external or internal ambients. 

The secure safe ambient calculus of [1] is a typed variant of Safe Ambients in 
which ambient types are protection domains expressing behavioural invariants. 

Security plays a crucial role in the theory and practice of distributed systems. 
In the present paper we propose a type discipline for safe mobile ambients which 
is essentially motivated by ensuring security properties. The type of an ambient 
name specifies a security level s. We require that an ambient at security level 
s can only be traversed or opened by ambients at security level at least s. For 
example the movement of an ambient n inside an ambient m: 

$n[in\ m.P_1 \mid P_2] \mid m[\overline{in}\ m.Q_1 \mid Q_2]$

can correctly be typed only if the security level of n is greater than or equal 
to that of m. Moreover, we consider also passive security levels, which forbid 
an ambient to influence the behaviour of a surrounding ambient belonging to a 
higher security level. 

Since the movement and opening rights can be unrelated, we consider two 
partial orders between security levels. 

As in the above-mentioned type discipline with group names, each ambient 
name belongs to a security level. However, we do not consider the possibility
of creating fresh, private security levels, since we consider two global partial
orders defined on security levels and it is not clear how to include the new level.
Moreover, thanks to the order relations, in the type of an ambient name we do 
not need to list explicitly all security levels of ambients it may traverse or open. 

In the simple case in which the movement rights are all equal and there are 
only two opening rights we classify the ambients as trustworthy and untrustwor- 
thy. Then the types can assure the secrecy property considered in [8] that an 
untrustworthy ambient can never open a trustworthy one. 

The paper is organized as follows. Section 2 recalls the definitions of Mobile 
Safe Ambients. For the sake of readability, we present our type system in two 
steps. Section 3 discusses a simpler version, which introduces the notion of se- 
curity level in ambient types. The full system is motivated by the requirement 
of obtaining more refined typings, in particular to type as immobile an ambient 
which opens a mobile ambient that, when opened, does not unleash mobility 
capabilities. To do this we need to distinguish the behaviours of processes before 
and after an ambient is opened (Section 4). Section 5 gives some examples of 




Security Types for Mobile Safe Ambients 217 



use of our type discipline. A first protocol models a mailserver with different 
mailboxes and users: each user is allowed to enter only his own mailbox and this 
is achieved via type constraints imposed on the security level order. A second 
example shows that we can encode the security policy for reading and writing 
discussed in [7] for the $\pi$-calculus. Lastly, we present the renaming, firewall
and channel protocols, which are already typed in [2] and in [9], both for
comparison and to show how some behavioural conditions can be expressed in our
system as type constraints. Some final remarks are made in Section 6.

2 Untyped Mobile Safe Ambients 

The calculus of Mobile Safe Ambients [9] is a refinement of the calculus of
Mobile Ambients [4] which allows a better control of the actions and therefore a
richer algebraic theory of processes. We will use here the calculus of Mobile Safe
Ambients, the only difference being that we describe infinite behaviours by $!P$
instead of $rec\,X.P$, since this slightly simplifies the typing rules. The syntax
is given in Figure 1 starting from an infinite countable set of names.

Figure 2 contains the reduction relation, which uses the structural congruence
$\equiv$. As customary, the structural congruence $\equiv$ is defined as the
minimal reflexive, transitive and symmetric relation which is a congruence and
moreover:

— satisfies $!P \equiv\ !P \mid P$;

— makes the operator $\mid$ commutative and associative, with 0 as zero element;

— allows one to stretch the scopes of restrictions, to permute restrictions and
to cancel a restriction followed only by 0;

— satisfies $\varepsilon.P \equiv P$ and $(M.M').P \equiv M.M'.P$.



M ::=                      expression               P, Q ::=                                process
  $n$                      name                       $(\nu n : W)P$                        restriction
  $in\ M$                  can enter into M           $0$                                   inactivity
  $\overline{in}\ M$       allows entry into M        $P \mid Q$                            parallel
  $out\ M$                 can exit out of M          $!P$                                  replication
  $\overline{out}\ M$      allows exit out of M       $M.P$                                 action
  $open\ M$                can open M                 $M[P]$                                ambient
  $\overline{open}\ M$     allows opening of M        $\langle M_1 \ldots M_k \rangle$      output action
  $\varepsilon$            empty path                 $(x_1 : W_1, \ldots, x_k : W_k).P$    input action
  $M.M'$                   path

Fig. 1. Expressions and Processes



3 Security Types 

The basic idea of our type system is to control at type level access and opening 
rights of ambients. Each ambient name has a type that represents its security 
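(R-in)    $n[in\ m.P_1 \mid P_2] \mid m[\overline{in}\ m.Q_1 \mid Q_2] \longrightarrow m[n[P_1 \mid P_2] \mid Q_1 \mid Q_2]$

(R-out)   $m[n[out\ m.P_1 \mid P_2] \mid \overline{out}\ m.Q_1 \mid Q_2] \longrightarrow n[P_1 \mid P_2] \mid m[Q_1 \mid Q_2]$

(R-open)  $open\ n.P \mid n[\overline{open}\ n.Q_1 \mid Q_2] \longrightarrow P \mid Q_1 \mid Q_2$

(R-I/O)   $\langle M_1, \ldots, M_k \rangle \mid (n_1 : W_1, \ldots, n_k : W_k).P \longrightarrow P\{n_1 := M_1, \ldots, n_k := M_k\}$

(R-par)   $P \longrightarrow Q \ \Rightarrow\ P \mid R \longrightarrow Q \mid R$

(R-res)   $P \longrightarrow Q \ \Rightarrow\ (\nu n : W)P \longrightarrow (\nu n : W)Q$

(R-amb)   $P \longrightarrow Q \ \Rightarrow\ n[P] \longrightarrow n[Q]$

(R-$\equiv$)   $P' \equiv P,\ P \longrightarrow Q,\ Q \equiv Q' \ \Rightarrow\ P' \longrightarrow Q'$

Fig. 2. Reduction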



level: a process inside an ambient can perform actions according to the
ambient's security level. A security level defines the mobility and opening
rights of the ambients which belong to it. As in previous type systems for Mobile
Ambients [5,2], the type of an ambient keeps information about the type of
messages exchanged inside it, whether the ambient can move and whether it can be
opened. Moreover, in the type of an ambient name, we consider also type
annotations that say whether the ambient can be traversed and whether it can open
other ambients. In Section 5, we exploit security levels to model a mailserver
where unauthorized accesses are forbidden as ill-typed processes (Subsection 5.1)
and to model an information flow policy in an ambient encoding of $\pi$-calculus
channels.

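Types. Types are defined starting from a universe $\mathcal{U}$ of active security
levels and a universe $\overline{\mathcal{U}}$ of passive security levels: each
element $s$ in $\mathcal{U}$ has a corresponding element $\overline{s}$ in
$\overline{\mathcal{U}}$. The universe $\mathcal{U}$ comes equipped with two
partial order relations, $\leq^{\curvearrowright}$ for mobility rights and
$\leq^{\circ}$ for opening rights. These orders are mirrored in
$\overline{\mathcal{U}}$.

We denote by $s, s', s_1, \ldots$ elements of $\mathcal{U}$, by
$\overline{s}, \overline{s'}, \overline{s_1}, \ldots$ elements of
$\overline{\mathcal{U}}$, and by $S, S', S_1, \ldots$ subsets of
$\mathcal{U} \cup \overline{\mathcal{U}}$.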

As usual [5], the exchange types are Shh, when no exchange is allowed, and
tuples whose elements are either ambient types or capability types.

An ambient type has the shape $s^{Y}_{V}[T^{Z}_{U}]$ where:

— $s$ is the security level to which the ambient belongs (similar to the group
of [3]);

— $T$ is the type of exchanges the ambient allows within (as in [5]);

— $Y$ says if the ambient is locked ($\bullet$) or unlocked ($\circ$); only
unlocked ambients can be opened (as in [2]);

— $V$ says if the ambient can ($\circ$) or cannot ($\bullet$) open another
ambient which belongs to a security level lower or equal in the order
$\leq^{\circ}$;

— $Z$ says if the ambient can move ($\curvearrowright$) or if it is immobile
($\veebar$) (as in [2]);

— $U$ says if the ambient can ($\leftrightarrow$) or cannot ($\oslash$) be
traversed by another ambient which belongs to a security level greater or equal
in the order $\leq^{\curvearrowright}$.

A capability type has the shape Cap[F] where F is an effect. 

An effect has the shape $S_1, S_2, T$, where $T$ is the type of exchanges which
can be unleashed by an open action (as in [3]). The set $S_1$ contains the
security levels of the ambients which can be traversed. The set $S_2$ contains
the security levels of the ambients which can be opened. In $S_1$ and $S_2$ the
levels are active or passive according to whether the ambient occurs in an
action or in a co-action.

The sets of type annotations and of types are respectively given in Figures 3 
and 4. 



Z ::=                     mobility annotation
  $\curvearrowright$      mobile
  $\veebar$               immobile

U ::=                     traversing annotation
  $\leftrightarrow$       can be traversed
  $\oslash$               cannot be traversed

V ::=                     opening annotation
  $\circ$                 can open
  $\bullet$               cannot open

Y ::=                     locking annotation
  $\circ$                 unlocked
  $\bullet$               locked

Fig. 3. Type annotations



F ::=                         effect
  $S_1, S_2, T$               moves and is traversed according to $S_1$, opens and
                              can be opened according to $S_2$, exchanges $T$

W ::=                         message type
  $s^{Y}_{V}[T^{Z}_{U}]$      ambient name in security level $s$
                              (contains processes whose effects agree with $\{Y, V, Z, U\}$)
  $Cap[F]$                    capability (unleashes $F$ effects)

T ::=                         exchange type
  Shh                         no exchange
  $W_1 \times \ldots \times W_k$   tuple exchange

Fig. 4. Types



Typing Rules. As usual, an environment E associates names with ambient and
capability types:

$E ::= \emptyset \mid E, n : W$.

The domain of E (notation dom(E)) is defined by:

$dom(E) = \emptyset$ if $E = \emptyset$, and $dom(E) = dom(E') \cup \{n\}$ if $E = E', n : W$.

The typing judgments are relative to a given universe $\mathcal{U}$ of security
levels and to a given environment E. There are six kinds of judgments:

$\mathcal{U}, E \vdash \diamond$      good environment

$\mathcal{U}, E \vdash T$             good exchange type T

$\mathcal{U}, E \vdash F$             good effect F

$\mathcal{U}, E \vdash W$             good message type W

$\mathcal{U}, E \vdash M : W$         good expression M of message type W

$\mathcal{U}, E \vdash P : F$         good process P with effect F.




220 



M. Dezani-Ciancaglini and I. Salvo 



Figure 5 contains the rules for deriving judgments of the first four kinds. These
rules follow in a standard way the syntax of types given in Figure 4.



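(Empty Env)       $\dfrac{}{\mathcal{U}, \emptyset \vdash \diamond}$

(Env Formation)   $\dfrac{\mathcal{U}, E \vdash W \quad n \notin dom(E)}{\mathcal{U}, E, n : W \vdash \diamond}$

(Shh)             $\dfrac{\mathcal{U}, E \vdash \diamond}{\mathcal{U}, E \vdash Shh}$

(Amb)             $\dfrac{\mathcal{U}, E \vdash T \quad s \in \mathcal{U}}{\mathcal{U}, E \vdash s^{Y}_{V}[T^{Z}_{U}]}$

(Prod)            $\dfrac{\mathcal{U}, E \vdash W_1 \ \ldots\ \mathcal{U}, E \vdash W_k}{\mathcal{U}, E \vdash W_1 \times \ldots \times W_k}$

(Cap)             $\dfrac{\mathcal{U}, E \vdash F}{\mathcal{U}, E \vdash Cap[F]}$

(Effect)          $\dfrac{\mathcal{U}, E \vdash T \quad \bigcup S_i \subseteq \mathcal{U} \cup \overline{\mathcal{U}}}{\mathcal{U}, E \vdash S_1, S_2, T}$

Fig. 5. Good Environments and Types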



The capabilities unleashed by opening an ambient n are the maximal capabilities
consistent with the type annotations and the security level in the type of n.
Therefore, to give the typing rules for the open expression, we need to define
when two sets $S, S'$ of security levels agree ($\bowtie$) with the type
annotation $\{V, Z, U\}$. This means that:

— if $S$ contains an active security level then $Z$ says that the ambient is mobile;

— if $S$ contains a passive security level then $U$ says that the ambient can be
traversed;

— if $S'$ contains an active security level then $V$ says that the ambient can open.

This allows us to build the pair of maximal sets $\vartheta(\{V, Z, U\}, s)$ which
agree with a given type annotation $\{V, Z, U\}$ and contain only a given (active)
security level $s$ and the corresponding (passive) level $\overline{s}$. See
Figure 6.



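$S, S' \bowtie \{V, Z, U\} \iff (\exists s \in S \Rightarrow Z = \curvearrowright)\ \&\ (\exists \overline{s} \in S \Rightarrow U = \leftrightarrow)\ \&\ (\exists s \in S' \Rightarrow V = \circ)$

$\vartheta(\{V, Z, U\}, s) = \max S, S'.\ S \cup S' \subseteq \{s, \overline{s}\}\ \&\ S, S' \bowtie \{V, Z, U\}$

Fig. 6. Agreement between sets of security levels and type annotations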
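The agreement relation and the maximal pair of sets can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the paper: the names `Level`, `Annot`, `agrees` and `maximal_sets` are hypothetical, and the annotation {V, Z, U} is encoded as three booleans. Since each condition of Figure 6 constrains elements one at a time, the maximal sets are just the admissible members of {s, s-bar}.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    """A security level; passive=True stands for the overlined copy."""
    name: str
    passive: bool = False


@dataclass(frozen=True)
class Annot:
    """Type annotation {V, Z, U} as booleans (hypothetical encoding)."""
    can_open: bool      # V
    mobile: bool        # Z
    traversable: bool   # U


def agrees(S, S_open, ann):
    """Check that the pair S, S' agrees with the annotation (Figure 6)."""
    if any(not l.passive for l in S) and not ann.mobile:
        return False    # an active level in S needs a mobile ambient
    if any(l.passive for l in S) and not ann.traversable:
        return False    # a passive level in S needs a traversable ambient
    if any(not l.passive for l in S_open) and not ann.can_open:
        return False    # an active level in S' needs the right to open
    return True


def maximal_sets(ann, s):
    """Largest S, S' drawn from {s, s-bar} that agree with ann."""
    both = (Level(s), Level(s, passive=True))
    S = frozenset(l for l in both if agrees({l}, set(), ann))
    S_open = frozenset(l for l in both if agrees(set(), {l}, ann))
    return S, S_open
```

In this reading the passive level always belongs to the second set, because Figure 6 places no condition on it; the locking annotation is checked separately when an effect is compared with an ambient type.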



To determine the effect of a path of expressions, we combine two effects to
obtain a new effect. As usual, we can combine two effects only if they share
the same exchange type. The combination of effects is just componentwise set
union, more precisely (overloading the symbol $\cup$):

$(S_1, S_2, T) \cup (S_3, S_4, T) = S_1 \cup S_3,\ S_2 \cup S_4,\ T.$
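As a small illustration, the componentwise union can be written as follows. The encoding is an assumption made for the sketch: an effect is a triple of two frozensets and an exchange type, levels are strings, and a leading "~" marks a passive level; the function name `combine` is hypothetical.

```python
def combine(f, g):
    """Componentwise union of two simple effects (S1, S2, T)."""
    s1, s2, t = f
    s3, s4, t2 = g
    if t != t2:
        # Effects can be combined only if they share the exchange type.
        raise ValueError("effects must share the same exchange type")
    return (s1 | s3, s2 | s4, t)
```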

The typing of a mobility action, $in\ M$ or $out\ M$, checks if the ambient M
can be traversed and builds a simple capability by putting the active security
level of M in the first set. The typing of a mobility co-action,
$\overline{in}\ M$ or $\overline{out}\ M$, is similar, except that the passive
security level is put in the first set.

The typing of an open action, $open\ M$, checks if the ambient M can be
opened, using the information given by the type of M, which has the shape
$s^{Y}_{V}[T^{Z}_{U}]$. If $Y$ is $\circ$, the ambient can be opened and in the
type of $open\ M$ we must take into account the effects unleashed by the ambient
M after being opened. This effect must contain a pair of sets $S, S'$ which agree
with $\{V, Z, U\}$. Since we have to take into account the maximal possible
effect, we build $S, S'$ as $\vartheta(\{V, Z, U\}, s)$. The final effect will be
$S, \{s\} \cup S', T$, where we add $s$ to take into account the open action.

An open co-action, $\overline{open}\ M$, can be typed only if the annotation $Y$
in the type of M is $\circ$, and the effect is simply
$\{\,\}, \{\overline{s}\}, T$. Figure 7 gives the typing rules for expressions.

Typing an ambient requires checking that the ambient type agrees with the
effect of the process running inside it. An effect $S, S', T$ agrees with an
ambient type $s^{Y}_{V}[{T'}^{Z}_{U}]$ iff:

— $T = T'$;

— the sets $S, S'$ agree with the type annotations $\{V, Z, U\}$;

— if $S'$ contains a passive security level then $Y$ says that the ambient can be
opened;

— all active security levels in $S$ are $\leq^{\curvearrowright} s$ and all
passive security levels in $S$ are $\leq^{\curvearrowright} \overline{s}$;

— all active security levels in $S'$ are $\leq^{\circ} s$ and all passive security
levels in $S'$ are $\leq^{\circ} \overline{s}$.

Conditions on passive security levels ensure that opening an ambient does not
influence, by unleashing co-capabilities, the behaviour of a surrounding ambient
which belongs to a higher security level.

In Figure 8 we formally define when an effect $S, S', T$ agrees with
$s^{Y}_{V}[{T'}^{Z}_{U}]$, using the definitions of Figure 6.
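The agreement check can be sketched as a single predicate. The sketch rests on several assumptions that are not in the paper: levels are encoded as (name, passive) pairs, the two partial orders are passed in as functions `leq_move` and `leq_open` on level names (which suffices because the orders are mirrored on passive levels), the bounds are read as "at most s", and the names `AmbType` and `effect_agrees` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AmbType:
    """Ambient type: level s, exchange T, annotations Y, V, Z, U as booleans."""
    level: str
    exchange: str
    unlocked: bool      # Y
    can_open: bool      # V
    mobile: bool        # Z
    traversable: bool   # U


def effect_agrees(effect, amb, leq_move, leq_open):
    """Does the simple effect (S, S', T) agree with the ambient type?"""
    S, S_open, T = effect
    if T != amb.exchange:
        return False
    # Agreement of S, S' with {V, Z, U}.
    if any(not p for _, p in S) and not amb.mobile:
        return False
    if any(p for _, p in S) and not amb.traversable:
        return False
    if any(not p for _, p in S_open) and not amb.can_open:
        return False
    # A passive level in S' means the ambient offers to be opened.
    if any(p for _, p in S_open) and not amb.unlocked:
        return False
    # Every level mentioned must be bounded by the level of the ambient.
    return (all(leq_move(n, amb.level) for n, _ in S)
            and all(leq_open(n, amb.level) for n, _ in S_open))
```

For instance, with a two-point order low ≤ high, a mobile ambient at level high may contain a process that traverses an ambient at level low, but not the other way round.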

Figure 9 gives the typing rules for processes. Rule (Proc Amb) checks that the 
effect of P agrees with the ambient type of n before building the process n[P]. 
Now n[P] has no action, and therefore its effect is {},{}, T for an arbitrary T. 
The remaining typing rules are almost standard. 

Properties of Well-Typed Processes. The effects we derive for processes are
rather informative about their behaviours. More precisely, if
$\mathcal{U}, E \vdash P : S, S', T$ we can show that:

— if P may perform a move action then $S \cap \mathcal{U} \neq \emptyset$;

— if P may perform a move co-action then $S \cap \overline{\mathcal{U}} \neq \emptyset$;

— if P may perform an open action then $S' \cap \mathcal{U} \neq \emptyset$;

— if P may perform an open co-action then $S' \cap \overline{\mathcal{U}} \neq \emptyset$.

It turns out that a process P has no thread [9], i.e. P cannot do actions,
whenever we can derive the judgement $\mathcal{U}, E \vdash P : S, S', T$ with
$(S \cup S') \cap \mathcal{U} = \emptyset$.
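(Exp n)      $\dfrac{\mathcal{U}, E, n : W, E' \vdash \diamond}{\mathcal{U}, E, n : W, E' \vdash n : W}$

(Exp $\varepsilon$)      $\dfrac{\mathcal{U}, E \vdash \diamond}{\mathcal{U}, E \vdash \varepsilon : Cap[\{\,\}, \{\,\}, T]}$

(Exp .)      $\dfrac{\mathcal{U}, E \vdash M : Cap[F] \quad \mathcal{U}, E \vdash M' : Cap[F']}{\mathcal{U}, E \vdash M.M' : Cap[F \cup F']}$

(Exp in)     $\dfrac{\mathcal{U}, E \vdash M : s^{Y}_{V}[T^{Z}_{\leftrightarrow}] \quad \mathcal{U}, E \vdash T'}{\mathcal{U}, E \vdash in\ M : Cap[\{s\}, \{\,\}, T']}$

(Exp $\overline{in}$)     $\dfrac{\mathcal{U}, E \vdash M : s^{Y}_{V}[T^{Z}_{U}] \quad \mathcal{U}, E \vdash T'}{\mathcal{U}, E \vdash \overline{in}\ M : Cap[\{\overline{s}\}, \{\,\}, T']}$

(Exp out)    $\dfrac{\mathcal{U}, E \vdash M : s^{Y}_{V}[T^{Z}_{\leftrightarrow}] \quad \mathcal{U}, E \vdash T'}{\mathcal{U}, E \vdash out\ M : Cap[\{s\}, \{\,\}, T']}$

(Exp $\overline{out}$)    $\dfrac{\mathcal{U}, E \vdash M : s^{Y}_{V}[T^{Z}_{U}] \quad \mathcal{U}, E \vdash T'}{\mathcal{U}, E \vdash \overline{out}\ M : Cap[\{\overline{s}\}, \{\,\}, T']}$

(Exp open)   $\dfrac{\mathcal{U}, E \vdash M : s^{\circ}_{V}[T^{Z}_{U}] \quad \vartheta(\{V, Z, U\}, s) = S, S'}{\mathcal{U}, E \vdash open\ M : Cap[S, \{s\} \cup S', T]}$

(Exp $\overline{open}$)   $\dfrac{\mathcal{U}, E \vdash M : s^{\circ}_{V}[T^{Z}_{U}]}{\mathcal{U}, E \vdash \overline{open}\ M : Cap[\{\,\}, \{\overline{s}\}, T]}$

Fig. 7. Good Expressions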



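$S, S', T \bowtie s^{Y}_{V}[{T'}^{Z}_{U}] \iff$
    $T = T'$ &
    $S, S' \bowtie \{V, Z, U\}$ &
    $(\exists \overline{s'} \in S' \Rightarrow Y = \circ)$ &
    $\forall s', \overline{s'} \in S.\ s' \leq^{\curvearrowright} s\ \&\ \overline{s'} \leq^{\curvearrowright} \overline{s}$ &
    $\forall s', \overline{s'} \in S'.\ s' \leq^{\circ} s\ \&\ \overline{s'} \leq^{\circ} \overline{s}$

Fig. 8. Agreement between effects and ambient types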
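(Proc Action)   $\dfrac{\mathcal{U}, E \vdash M : Cap[F] \quad \mathcal{U}, E \vdash P : F'}{\mathcal{U}, E \vdash M.P : F \cup F'}$

(Proc Amb)      $\dfrac{\mathcal{U}, E \vdash M : s^{Y}_{V}[T^{Z}_{U}] \quad \mathcal{U}, E \vdash P : F \quad F \bowtie s^{Y}_{V}[T^{Z}_{U}] \quad \mathcal{U}, E \vdash T'}{\mathcal{U}, E \vdash M[P] : \{\,\}, \{\,\}, T'}$

(Proc Res)      $\dfrac{\mathcal{U}, E, n : s^{Y}_{V}[T^{Z}_{U}] \vdash P : F}{\mathcal{U}, E \vdash (\nu n : s^{Y}_{V}[T^{Z}_{U}])P : F}$

(Proc Par)      $\dfrac{\mathcal{U}, E \vdash P : F \quad \mathcal{U}, E \vdash Q : F'}{\mathcal{U}, E \vdash P \mid Q : F \cup F'}$

(Proc 0)        $\dfrac{\mathcal{U}, E \vdash T}{\mathcal{U}, E \vdash 0 : \{\,\}, \{\,\}, T}$

(Proc Repl)     $\dfrac{\mathcal{U}, E \vdash P : F}{\mathcal{U}, E \vdash\ !P : F}$

(Proc Input)    $\dfrac{\mathcal{U}, E, n_1 : W_1, \ldots, n_k : W_k \vdash P : S, S', W_1 \times \ldots \times W_k}{\mathcal{U}, E \vdash (n_1 : W_1, \ldots, n_k : W_k).P : S, S', W_1 \times \ldots \times W_k}$

(Proc Output)   $\dfrac{\mathcal{U}, E \vdash M_1 : W_1 \ \ldots\ \mathcal{U}, E \vdash M_k : W_k}{\mathcal{U}, E \vdash \langle M_1, \ldots, M_k \rangle : \{\,\}, \{\,\}, W_1 \times \ldots \times W_k}$

Fig. 9. Good Processes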



This implies that we can type only single-threaded processes if we restrict rule
(Proc Par) as follows:

$\dfrac{\mathcal{U}, E \vdash P : F \quad \mathcal{U}, E \vdash Q : F' \quad st(F)\ \text{or}\ st(F')}{\mathcal{U}, E \vdash P \mid Q : F \cup F'}$

where $st(S, S', T)$ is short for $(S \cup S') \cap \mathcal{U} = \emptyset$. A more
refined typing for single-threaded processes is given in [9].
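The side condition st can be sketched directly on effects. The encoding is an assumption for illustration only: an effect is a triple of two frozensets of (name, passive) pairs and an exchange type, and the function names `st` and `par_single_threaded` are hypothetical.

```python
def st(effect):
    """st(S, S', T): the effect mentions no active security level."""
    S, S_open, _ = effect
    return all(passive for _, passive in S | S_open)


def par_single_threaded(f, g):
    """Effect of P | Q under the restricted (Proc Par) rule."""
    if not (st(f) or st(g)):
        raise TypeError("at most one component may perform actions")
    (s1, s2, t), (s3, s4, t2) = f, g
    if t != t2:
        raise TypeError("effects must share the same exchange type")
    return (s1 | s3, s2 | s4, t)
```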

As usual, the soundness of the typing rules is assured by a Subject Reduction
Theorem. Since the effects precisely describe the actions which can be performed
by a process, and the reduction consumes actions, in general we can derive a
“better” effect for the reduced process. To formally state this, we introduce an
order $\sqsubseteq$ on effects: $F' \sqsubseteq F$ means that the sets in $F$
“cover”, with respect to the appropriate order, the corresponding sets in $F'$.
We say that a set $S'$ covers a set $S$ with respect to the order $\leq^{*}$
($S \sqsubseteq^{*} S'$) iff for all $s \in S$ we can find $s' \in S'$ such that
$s \leq^{*} s'$, and similarly for passive security levels. Figure 10 defines
this order.



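$S \sqsubseteq^{*} S' \iff \forall s \in S.\ \exists s' \in S'.\ s \leq^{*} s'\ \&\ \forall \overline{s} \in S.\ \exists \overline{s'} \in S'.\ \overline{s} \leq^{*} \overline{s'}$    $(* \in \{\curvearrowright, \circ\})$

$S_1, S_2, T \sqsubseteq S_1', S_2', T \iff S_1 \sqsubseteq^{\curvearrowright} S_1'\ \&\ S_2 \sqsubseteq^{\circ} S_2'$

Fig. 10. Partial order of effects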
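The covering relation and the induced order on simple effects can be sketched as follows. As before, the encoding is an assumption for illustration: levels are (name, passive) pairs, the orders are functions on names, and `covers` and `effect_leq` are hypothetical names.

```python
def covers(S, S_big, leq):
    """Every level in S is below some level of the same kind in S_big."""
    return all(
        any(p == q and leq(a, b) for b, q in S_big)
        for a, p in S
    )


def effect_leq(f_small, f_big, leq_move, leq_open):
    """The order on simple effects: componentwise covering, same exchange type."""
    (s1, s2, t), (r1, r2, u) = f_small, f_big
    return t == u and covers(s1, r1, leq_move) and covers(s2, r2, leq_open)
```

Subject reduction then says that the effect derived after a reduction step is below the effect derived before it; in particular an effect that has consumed a capability (and so mentions fewer levels) is below the original one.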
Theorem 1. If $\mathcal{U}, E \vdash P : F$ and $P \longrightarrow Q$ then
$\mathcal{U}, E \vdash Q : F'$ for some $F'$ such that $F' \sqsubseteq F$.

We will prove the soundness of the full system (Theorem 2) which implies 
that of the present one. 

The subject reduction property means that: 

— every communication is well-typed; 

— no locked ambient will ever be opened, and an unlocked ambient will only be
opened inside an ambient which can open and which has a higher or equal
security level in the order $\leq^{\circ}$;

— no immobile ambient will ever move, and a mobile ambient will only traverse
ambients which can be traversed and which have lower or equal security levels
in the order $\leq^{\curvearrowright}$.



4 Security Types with Opening Control 

As pointed out in [9], better control over the ambient behaviours helps in deriving
a richer algebraic theory and hence easier proofs of correctness.

One important shortcoming of the type system introduced in the previous section,
compared with the type system of [9], is that it is not possible to type as
immobile an ambient that opens mobile ambients. This is not satisfactory. As an
example, think of a message sent to a server: the message is a mobile entity,
whereas the server is not. Reasonably, in an Ambient Calculus encoding, the
ambient representing the server has to open the ambient representing the message.
A message is represented by a mobile ambient, but opening it does not unleash
movement capabilities. We will further discuss this problem in the example
concerning a firewall protocol, in Subsection 5.3.

In this section we extend our type system in order to derive judgements of this
kind. In the typical cases in which only one $\overline{open}$ appears in the
processes inside an ambient, we can exploit the presence of co-actions to
distinguish between effects before and after the unique open co-action, and hence
determine the capabilities unleashed by opening such an ambient. As expected, such
control over ambient behaviour complicates the type system considerably.

Expanded Types. We extend the set of types as follows. The locking annotations
can also have the shape:

$Y = \circ\{V, Z, U\}$

which means that the ambient can be opened and that processes spilled out after
the opening behave according to the annotation $\{V, Z, U\}$.

Effects too can keep information about the opening and mobility rights required
by processes before and after the opening of their surrounding ambient. Therefore
effects can be simple effects that, as before, have the shape $S_1, S_2, T$, or
expanded effects, which have the shape $S_1, S_2, \{S_3, S_4\}, T$. An expanded
effect says that there is only one $\overline{open}$ and that before its execution
we have the effect $S_1, S_2, T$, while after its execution we have the effect
$S_3, S_4, T$. In Fig. 11 we extend the syntax of effects.
H ::=                       opening effect
  $S$                       opens and can be opened according to $S$
  $S_1, \{S_2, S_3\}$       opens and can be opened according to $S_1$;
                            after the open, moves and is traversed according to $S_2$,
                            opens and can be opened according to $S_3$

F ::=                       effect
  $S, H, T$                 moves and is traversed according to $S$, opens and
                            can be opened according to $H$, exchanges $T$

Fig. 11. Expanded Effects



Expanded Typing Rules. The rules for type formation must be extended by
adding a rule for the formation of expanded effects:

(Effect expanded)   $\dfrac{\mathcal{U}, E \vdash T \quad \bigcup S_i \subseteq \mathcal{U} \cup \overline{\mathcal{U}} \quad S_2 \cap \overline{\mathcal{U}} \neq \emptyset}{\mathcal{U}, E \vdash S_1, S_2, \{S_3, S_4\}, T}$

where the condition $S_2 \cap \overline{\mathcal{U}} \neq \emptyset$ requires
that at least one ambient can be opened.

The combination of two effects is still componentwise set union when the
first one is simple, but now the resulting effect is either simple or expanded
according to the shape of the second effect. If the effects are both expanded,
their combination is a simple effect obtained by losing the information on the
difference between the effects before and after the two $\overline{open}$
actions: in such a case we cannot foresee, when typing an open action, which
$\overline{open}$ will be involved in the reduction, and hence the extra
information given by expanded effects is not useful to obtain more refined
typings. If the first effect is expanded and the second is simple, the
combination is an expanded effect obtained by taking into account the second
effect only after the unique $\overline{open}$ action. The formal definition of
combination of effects is given in Figure 12.






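$(S, H, T) \uplus (S', H', T) =$

    $S \cup S',\ H \cup H',\ T$                                         if $H \subseteq \mathcal{U} \cup \overline{\mathcal{U}}$

    $S,\ S_1,\ \{S_2 \cup S',\ S_3 \cup H'\},\ T$                       if $H = S_1, \{S_2, S_3\}$ & $H' \subseteq \mathcal{U} \cup \overline{\mathcal{U}}$

    $S \cup S' \cup S_2 \cup S_2',\ S_1 \cup S_1' \cup S_3 \cup S_3',\ T$   if $H = S_1, \{S_2, S_3\}$ & $H' = S_1', \{S_2', S_3'\}$

Fig. 12. Combination of simple and expanded effects for expressions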
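The three cases of this combination can be sketched in Python. The encoding below is an assumption made for illustration: `Effect` and `seq` are hypothetical names, levels are strings with a leading "~" for passive ones, and an expanded effect is marked by a non-empty `after` field. In the first case, when the second effect is expanded, the sketch reads "H ∪ H'" as adding H to the before-open component of H' and keeping its after-open sets; this reading is not spelled out in the text.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

Levels = FrozenSet[str]


@dataclass(frozen=True)
class Effect:
    move: Levels                             # S
    open_: Levels                            # H if simple, S1 if expanded
    after: Optional[Tuple[Levels, Levels]]   # None if simple, else (S2, S3)
    exch: str                                # T


def seq(f, g):
    """Combine the effect f of a prefix with the effect g of what follows."""
    if f.exch != g.exch:
        raise ValueError("effects must share the same exchange type")
    if f.after is None:
        # First effect simple: union, keeping the shape of the second.
        return Effect(f.move | g.move, f.open_ | g.open_, g.after, f.exch)
    a_move, a_open = f.after
    if g.after is None:
        # First expanded, second simple: g only counts after the open.
        return Effect(f.move, f.open_,
                      (a_move | g.move, a_open | g.open_), f.exch)
    # Both expanded: forget the before/after distinction.
    b_move, b_open = g.after
    return Effect(f.move | g.move | a_move | b_move,
                  f.open_ | g.open_ | a_open | b_open, None, f.exch)
```

For example, the effect of an open co-action followed by a movement records the movement only in the after-open component, which is what allows an enclosing ambient to be typed as immobile.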



In the typing rules for expressions we have to modify the rule for expression
paths (Exp .), replacing set union with the operator $\uplus$ for the combination
of effects. The other rules remain the same, even when $Y$ is
$\circ\{V, Z, U\}$, and we add rules for all actions to take into account
expanded effects. The new rules for typing the mobility actions allow the
movement after the ambient has been opened.
Similarly to the case of simple effects, the typing of an open action, $open\ M$,
checks if the ambient M can be opened, using the information given by the type
of M, which has the shape $s^{Y}_{V}[T^{Z}_{U}]$. If $Y$ is
$\circ\{V', Z', U'\}$, the ambient can be opened and in the type of $open\ M$ we
must take into account the effects unleashed by the ambient M after being opened.
This effect must contain a pair of sets $S, S'$ which agree with
$\{V', Z', U'\}$, so as for simple effects we take
$\vartheta(\{V', Z', U'\}, s)$. The final effect will be
$S, \{s\} \cup S', T$, where we add $s$ to take into account the open action.

An open co-action, $\overline{open}\ M$, can be typed if the annotation $Y$ in
the type of M is $\circ\{V', Z', U'\}$. In this case the effect is expanded to
$\{\,\}, \{\overline{s}\}, \{\{\,\}, \{\,\}\}, T$. This allows us to distinguish
between the effects before and after the opening of M. Figure 13 gives the typing
rules for expressions.



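(Exp . ★)      $\dfrac{\mathcal{U}, E \vdash M : Cap[F] \quad \mathcal{U}, E \vdash M' : Cap[F']}{\mathcal{U}, E \vdash M.M' : Cap[F \uplus F']}$

(Exp in ★)     $\dfrac{\mathcal{U}, E \vdash M : \ldots \quad \mathcal{U}, E \vdash T'}{\mathcal{U}, E \vdash in\ M : Cap[\{s\}, \{\,\}, T']}$

(Exp $\overline{in}$ ★)     $\dfrac{\mathcal{U}, E \vdash M : \ldots \quad \mathcal{U}, E \vdash T'}{\mathcal{U}, E \vdash \overline{in}\ M : Cap[\{\overline{s}\}, \{\,\}, T']}$

(Exp out ★)    $\dfrac{\mathcal{U}, E \vdash M : \ldots \quad \mathcal{U}, E \vdash T'}{\mathcal{U}, E \vdash out\ M : Cap[\{s\}, \{\,\}, T']}$

(Exp $\overline{out}$ ★)    $\dfrac{\mathcal{U}, E \vdash M : \ldots \quad \mathcal{U}, E \vdash T'}{\mathcal{U}, E \vdash \overline{out}\ M : Cap[\{\overline{s}\}, \{\,\}, T']}$

(Exp open ★)   $\dfrac{\mathcal{U}, E \vdash M : s^{\circ\{V', Z', U'\}}_{V}[T^{Z}_{U}] \quad \vartheta(\{V', Z', U'\}, s) = S, S'}{\mathcal{U}, E \vdash open\ M : Cap[S, \{s\} \cup S', T]}$

(Exp $\overline{open}$ ★)   $\dfrac{\mathcal{U}, E \vdash M : s^{\circ\{V', Z', U'\}}_{V}[T^{Z}_{U}]}{\mathcal{U}, E \vdash \overline{open}\ M : Cap[\{\,\}, \{\overline{s}\}, \{\{\,\}, \{\,\}\}, T]}$

Fig. 13. Good expressions with expanded effects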



We also have to define when an expanded effect agrees with an ambient type.
An expanded effect $S_1, S_2, \{S_3, S_4\}, T$ agrees with an ambient type
$s^{Y}_{V}[{T'}^{Z}_{U}]$ iff:

— $T = T'$;

— $S_1, S_2$ agree with $\{V, Z, U\}$;

— $Y = \circ\{V', Z', U'\}$ and $S_3, S_4$ agree with $\{V', Z', U'\}$;

— all active security levels in $S_1 \cup S_3$ are $\leq^{\curvearrowright} s$
and all passive security levels in $S_1 \cup S_3$ are
$\leq^{\curvearrowright} \overline{s}$;

— all active security levels in $S_2 \cup S_4$ are $\leq^{\circ} s$ and all
passive security levels in $S_2 \cup S_4$ are $\leq^{\circ} \overline{s}$.

See Figure 14 for the formal definition.

Lastly, we need to combine the effects of two processes when they are put in
parallel. If both effects are simple or both are expanded, we can use the
operator $\uplus$ as defined in Figure 12, obtaining in both cases a simple
effect. Otherwise only one process contains an open co-action, so we can obtain
an expanded effect in which the sets of the simple effect are considered both
before and after the opening action. This definition is given in Figure 15.

In the typing rules for processes we cannot combine expanded effects by means
of set union as in rules (Proc Action) and (Proc Par). The rule (Proc Action ★)
uses the operator $\uplus$ of Figure 12, since the action M will be done before
the actions in P. Instead, the typing rule (Proc Par ★) uses the operator
$\overline{\uplus}$ of Figure 15, since the actions in P and in Q will be done in
parallel. Lastly, rule (Proc Input ★) takes into account that the effects of P
can be either simple or expanded. The new rules are given in Figure 16. The
remaining rules do not change, but we adopt the convention that all effects can
be either simple or expanded.



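$S_1, S_2, \{S_3, S_4\}, T \bowtie s^{Y}_{V}[{T'}^{Z}_{U}] \iff$
    $T = T'$ &
    $S_1, S_2 \bowtie \{V, Z, U\}$ &
    $Y = \circ\{V', Z', U'\}$ &
    $S_3, S_4 \bowtie \{V', Z', U'\}$ &
    $\forall s', \overline{s'} \in S_1 \cup S_3.\ s' \leq^{\curvearrowright} s\ \&\ \overline{s'} \leq^{\curvearrowright} \overline{s}$ &
    $\forall s', \overline{s'} \in S_2 \cup S_4.\ s' \leq^{\circ} s\ \&\ \overline{s'} \leq^{\circ} \overline{s}$

Fig. 14. Agreement between expanded effects and ambient types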






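$(S, H, T)\ \overline{\uplus}\ (S', H', T) =$

    $S \cup S',\ S_1 \cup H',\ \{S_2 \cup S',\ S_3 \cup H'\},\ T$     if $H = S_1, \{S_2, S_3\}$ & $H' \subseteq \mathcal{U} \cup \overline{\mathcal{U}}$

    $S \cup S',\ S_1 \cup H,\ \{S_2 \cup S,\ S_3 \cup H\},\ T$        if $H \subseteq \mathcal{U} \cup \overline{\mathcal{U}}$ & $H' = S_1, \{S_2, S_3\}$

    $(S, H, T) \uplus (S', H', T)$                                      otherwise

Fig. 15. Combination of simple and expanded effects for processes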
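The parallel combination can be sketched in the same style. Again this is only an illustration under assumptions: `Effect` and `par` are hypothetical names, levels are strings with a leading "~" for passive ones, and the "otherwise" case is written out directly as the union that the sequential operator yields when both effects are simple or both are expanded.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

Levels = FrozenSet[str]


@dataclass(frozen=True)
class Effect:
    move: Levels                             # S
    open_: Levels                            # H if simple, S1 if expanded
    after: Optional[Tuple[Levels, Levels]]   # None if simple, else (S2, S3)
    exch: str                                # T


def par(f, g):
    """Combine the effects of two processes running in parallel."""
    if f.exch != g.exch:
        raise ValueError("effects must share the same exchange type")
    if f.after is None and g.after is not None:
        return par(g, f)    # the definition is symmetric
    if f.after is not None and g.after is None:
        # The simple effect counts both before and after the open.
        a_move, a_open = f.after
        return Effect(f.move | g.move, f.open_ | g.open_,
                      (a_move | g.move, a_open | g.open_), f.exch)
    # Both simple or both expanded: the result is a simple effect.
    move, open_ = f.move | g.move, f.open_ | g.open_
    for e in (f, g):
        if e.after is not None:
            move, open_ = move | e.after[0], open_ | e.after[1]
    return Effect(move, open_, None, f.exch)
```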



Properties of Well-Typed Processes. All the observations made for the
system of Section 3 remain valid provided that:

— if $\mathcal{U}, E \vdash P : S_1, S_2, \{S_3, S_4\}, T$ we replace $S$ by $S_1 \cup S_3$ and $S'$ by $S_2 \cup S_4$;
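(Proc Input ★)    $\dfrac{\mathcal{U}, E, n_1 : W_1, \ldots, n_k : W_k \vdash P : S, H, W_1 \times \ldots \times W_k}{\mathcal{U}, E \vdash (n_1 : W_1, \ldots, n_k : W_k).P : S, H, W_1 \times \ldots \times W_k}$

(Proc Action ★)   $\dfrac{\mathcal{U}, E \vdash M : Cap[F] \quad \mathcal{U}, E \vdash P : F'}{\mathcal{U}, E \vdash M.P : F \uplus F'}$

(Proc Par ★)      $\dfrac{\mathcal{U}, E \vdash P : F \quad \mathcal{U}, E \vdash Q : F'}{\mathcal{U}, E \vdash P \mid Q : F\ \overline{\uplus}\ F'}$

Fig. 16. Good Processes with Expanded Effects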



— we consider rule (Proc Par ★) instead of rule (Proc Par) in the discussion
about single-threaded ambients.

Finally, to obtain a Soundness Result for the new type system we generalize
the order relations to expanded effects as shown in Figure 17.



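$S_3, S_4, T \sqsubseteq S_1', S_2', \{S_3', S_4'\}, T \iff S_3 \sqsubseteq^{\curvearrowright} S_3'\ \&\ S_4 \sqsubseteq^{\circ} S_4'$

$S_1, S_2, \{S_3, S_4\}, T \sqsubseteq S_1', S_2', \{S_3', S_4'\}, T \iff S_i \sqsubseteq^{\curvearrowright} S_i'$ for $i = 1, 3$ & $S_i \sqsubseteq^{\circ} S_i'$ for $i = 2, 4$

Fig. 17. Partial order of simple and expanded effects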



Theorem 2. If $\mathcal{U}, E \vdash P : F$ and $P \longrightarrow Q$ then
$\mathcal{U}, E \vdash Q : F'$ for some $F'$ such that $F' \sqsubseteq F$.

Proof. The proof is by induction on the definition of $\longrightarrow$.

We consider only rules (R-in) and (R-open). The case of (R-out) is similar
to that of (R-in) and the other rules follow easily by induction.

For rule (R-in) the key observation is that

if $F \bowtie s^{Y}_{V}[T^{Z}_{U}]$ and $F' \sqsubseteq F$ then $F' \bowtie s^{Y}_{V}[T^{Z}_{U}]$.

This follows easily from the definitions of $\bowtie$ (Figure 8) and
$\sqsubseteq$ (Figure 10). The typing of the left-hand side contains the
sub-derivations shown in Figure 18. The two applications of rule (Proc Amb)
require respectively
$((\{s\}, \{\,\}, T') \uplus F_1)\ \overline{\uplus}\ F_2 \bowtie {s'}^{Y'}_{V'}[{T'}^{Z'}_{U'}]$
and
$((\{\overline{s}\}, \{\,\}, T) \uplus F_1')\ \overline{\uplus}\ F_2' \bowtie s^{Y}_{V}[T^{Z}_{U}]$.
Therefore we can deduce the same type for the right-hand side as shown in
Figure 19, where the two applications of rule (Proc Amb) require respectively
$F_1\ \overline{\uplus}\ F_2 \bowtie {s'}^{Y'}_{V'}[{T'}^{Z'}_{U'}]$ and
$F_1'\ \overline{\uplus}\ F_2' \bowtie s^{Y}_{V}[T^{Z}_{U}]$.

For rule (R-open) notice that:

if $S_1, S_2, T \bowtie s^{\circ}_{V}[T^{Z}_{U}]$ then $S_1, S_2, T \sqsubseteq \vartheta(\{V, Z, U\}, s), T$;

if $S_1, S_2, \{S_3, S_4\}, T \bowtie s^{\circ\{V', Z', U'\}}_{V}[T^{Z}_{U}]$ then $S_3, S_4, T \sqsubseteq \vartheta(\{V', Z', U'\}, s), T$.
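$\mathcal{U}, E \vdash m : s^{Y}_{V}[T^{Z}_{U}]$    $\mathcal{U}, E \vdash T'$
$\mathcal{U}, E \vdash in\ m : Cap[\{s\}, \{\,\}, T']$    $\mathcal{U}, E \vdash P_1 : F_1$
$\mathcal{U}, E \vdash in\ m.P_1 : (\{s\}, \{\,\}, T') \uplus F_1$    $\mathcal{U}, E \vdash P_2 : F_2$
$\mathcal{U}, E \vdash in\ m.P_1 \mid P_2 : ((\{s\}, \{\,\}, T') \uplus F_1)\ \overline{\uplus}\ F_2$
$\mathcal{U}, E \vdash n : {s'}^{Y'}_{V'}[{T'}^{Z'}_{U'}]$    $\mathcal{U}, E \vdash in\ m.P_1 \mid P_2 : ((\{s\}, \{\,\}, T') \uplus F_1)\ \overline{\uplus}\ F_2$    $\mathcal{U}, E \vdash T''$
$\mathcal{U}, E \vdash n[in\ m.P_1 \mid P_2] : \{\,\}, \{\,\}, T''$

$\mathcal{U}, E \vdash m : s^{Y}_{V}[T^{Z}_{U}]$    $\mathcal{U}, E \vdash T$
$\mathcal{U}, E \vdash \overline{in}\ m : Cap[\{\overline{s}\}, \{\,\}, T]$    $\mathcal{U}, E \vdash Q_1 : F_1'$
$\mathcal{U}, E \vdash \overline{in}\ m.Q_1 : (\{\overline{s}\}, \{\,\}, T) \uplus F_1'$    $\mathcal{U}, E \vdash Q_2 : F_2'$
$\mathcal{U}, E \vdash \overline{in}\ m.Q_1 \mid Q_2 : ((\{\overline{s}\}, \{\,\}, T) \uplus F_1')\ \overline{\uplus}\ F_2'$
$\mathcal{U}, E \vdash m : s^{Y}_{V}[T^{Z}_{U}]$    $\mathcal{U}, E \vdash \overline{in}\ m.Q_1 \mid Q_2 : ((\{\overline{s}\}, \{\,\}, T) \uplus F_1')\ \overline{\uplus}\ F_2'$    $\mathcal{U}, E \vdash T''$
$\mathcal{U}, E \vdash m[\overline{in}\ m.Q_1 \mid Q_2] : \{\,\}, \{\,\}, T''$

Fig. 18. Typing of the left-hand side of rule (R-in)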



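$\mathcal{U}, E \vdash P_1 : F_1$    $\mathcal{U}, E \vdash P_2 : F_2$
$\mathcal{U}, E \vdash n : {s'}^{Y'}_{V'}[{T'}^{Z'}_{U'}]$    $\mathcal{U}, E \vdash P_1 \mid P_2 : F_1\ \overline{\uplus}\ F_2$    $\mathcal{U}, E \vdash T$
$\mathcal{U}, E \vdash n[P_1 \mid P_2] : \{\,\}, \{\,\}, T$

$\mathcal{U}, E \vdash Q_1 : F_1'$    $\mathcal{U}, E \vdash Q_2 : F_2'$
$\mathcal{U}, E \vdash n[P_1 \mid P_2] : \{\,\}, \{\,\}, T$    $\mathcal{U}, E \vdash Q_1 \mid Q_2 : F_1'\ \overline{\uplus}\ F_2'$
$\mathcal{U}, E \vdash n[P_1 \mid P_2] \mid Q_1 \mid Q_2 : F_1'\ \overline{\uplus}\ F_2'$

$\mathcal{U}, E \vdash m : s^{Y}_{V}[T^{Z}_{U}]$    $\mathcal{U}, E \vdash n[P_1 \mid P_2] \mid Q_1 \mid Q_2 : F_1'\ \overline{\uplus}\ F_2'$    $\mathcal{U}, E \vdash T''$
$\mathcal{U}, E \vdash m[n[P_1 \mid P_2] \mid Q_1 \mid Q_2] : \{\,\}, \{\,\}, T''$

Fig. 19. Typing of the right-hand side of rule (R-in)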
This is a consequence of the definitions of $\bowtie$ (Figure 8), $\sqsubseteq$ (Figure 10), and
9( , ) (Figure 6). Figure 20 shows the typing of the left-hand side, under the
assumption that Y = o{V', Z', U'} and 9({F', Z', U'},s) — S,S' (the case Y = 
o and 9({F, Z, [/}, s) = S,S' is similar and simpler). The application of rule 
(Proc Amb) requires (({ },{s},{{ }, { }}, T) (+) Fi)l+)F 2 ixi Sy[T^], therefore 
(({ {s}) {{ { }}^T) l+)T'i)l+)F 2 must be an expanded effect. Let 

(({ },{ }},T)l+|Fi)i+iF2 =Si,S2,{S3,S4},T. 

Then by the definitions of 1+J (Figure 12), of 1+J (Figure 15), and of C we get 
^"il±)-f 2 !^S 3 , $ 4 , T. From the above remark Si, S 2 , {S 3 , S 4 }, T ixi Sy[T^] implies 
S 3 , S 4 , res. S', T. We obtain A 1 I+JF 2 CS, S', T and also A|+)(Ai|+)F 2 )C(S, {s} U 
S',T)\+jF being 1+) commutative. We are done, since we can derive Al+)(Fil+)F 2 ) 
for the right-hand side as shown in Figure 21. 



Fig. 20. Typing of the left-hand side of rule (R-open)



5 Examples 

In this section we present some examples, where we assume a different security
level for each ambient. The examples in Subsections 5.2, 5.3 and 5.4 are discussed
both in [9] and [2]: we present them in order to show which kinds of typing
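\[
\dfrac{\mathcal{U},E \vdash P : F \qquad
       \dfrac{\mathcal{U},E \vdash Q_1 : F_1 \qquad \mathcal{U},E \vdash Q_2 : F_2}
             {\mathcal{U},E \vdash Q_1 \mid Q_2 : F_1 \uplus F_2}}
      {\mathcal{U},E \vdash P \mid Q_1 \mid Q_2 : F \uplus (F_1 \uplus F_2)}
\]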

Fig. 21. Typing of the right-hand side of rule (R-open) 



are derivable in our type system and which constraints on the order of security
levels are required to type them. The other two examples, instead, show
possible applications of our type system, in particular how to use it to ensure
some security properties. Examples 5.2, 5.4 and 5.5 can be typed in the simpler
system of Section 3, while the other examples require the extension discussed in
Section 4.

5.1 Mailserver 

In this example we model a mailserver. In our model, a mailserver MS is an
ambient which contains a set of mailboxes $MBOX_i$, for $1 \le i \le k$. A message
MSG is a mobile ambient that first enters the mailserver and then a mailbox.
Messages have high-level mobility rights, whereas each user must have the same
mobility rights as its mailbox. The idea is that users do not have the right to enter
the mailboxes of other users: a user $U_i$ that tries to use a capability in $mb_j$, for
$i \neq j$, is not a well-typed process.

$MSG = m[in\ ms.\,in\ mb_i.\,\overline{open}\ m.\,\langle M\rangle]$

$MS = ms[\,!\overline{in}\ ms \mid\ !\overline{out}\ ms \mid MBOX_1 \mid \ldots \mid MBOX_k]$

$MBOX_i = mb_i[\,!\overline{in}\ mb_i \mid\ !\overline{out}\ mb_i]$

$U_i = u_i[in\ ms.\,in\ mb_i.\,open\ m.\,(x : T).\,out\ mb_i.\,out\ ms.\,P]$

where T is the exchange type of P. 

Considering an environment E such that: 

U^E V- ms : mserv^[T^] 

U,E ^ mbi : mboXi|-[T^] 

U,E^ u^\ uti^[T^] 

W, A h M : T 

we can type processes inside ambients as follows: 

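$\mathcal{U},E \vdash in\ ms.\,in\ mb_i.\,\overline{open}\ m.\,\langle M\rangle : \{mserv, mbox_i\}, \{msg\}, \{\emptyset, \emptyset\}, T$
$\mathcal{U},E \vdash\ !\overline{in}\ ms \mid\ !\overline{out}\ ms \mid MBOX_1 \mid \ldots \mid MBOX_k : \{mserv\}, \{\}, T$
$\mathcal{U},E \vdash\ !\overline{in}\ mb_i \mid\ !\overline{out}\ mb_i : \{mbox_i\}, \{\}, T$

$\mathcal{U},E \vdash in\ ms.\,in\ mb_i.\,open\ m.\,(x : T).\,out\ mb_i.\,out\ ms.\,P : \{mserv, mbox_i\}, \{msg\}, T$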

It is easy to check that process types agree with the corresponding ambient types. In
order to avoid unauthorized access to mailboxes, we consider the following order
defined on $\mathcal{U}$:

msg utj mboXi for all i 
and



uti utj & utj uti for all i ^ j 
uti >° msg for all i. 



5.2 Renaming 

We recall the construct given in [9] to change the name of an ambient: 

$n\ be\ m.P = m[out\ n.\,\overline{in}\ m.\,open\ n.\,P] \mid \overline{out}\ n.\,in\ m.\,\overline{open}\ n$

Let E be an environment such that: 

U,E^m: sm^[T^] 

U,E^n: sna[r:^] 

where Y, U and T depend on the type of P. We assume that in the environment 
E we can derive the following typing judgments for P and Q\ 

U,EV P : Sf,Sf,T 
U,EhQ: Sf,S^,T 

The desired property is that^: 

n[n be m.P | Q] ~ m[P | Q]

for all Q such that no ambient in Q can perform an out n action. In our type 
system, this last condition can be assumed by requiring that sn ^ Sf . 

In such an environment we can derive the following typing for processes: 

U,E h out n.in m.open n.P : 

({sn,sfh, sh},{sn,sh},T)l+)(Sf,S|',T) 

U,E ^ out n.in m.open n : {stT, sm}, {stT}, T 

Finally we observe that we can type the renaming protocol under the following 
conditions on the security levels order: 

sm >° sn 
sm sn 

In particular, it seems meaningful that security levels sm and sn are the same. 

5.3 Firewall 

The untyped protocol given in [9] for controlling access through a firewall is: 

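$AG = m[\overline{in}\ m.\,open\ k.\,(x)\,x.\,\overline{open}\ m.\,Q]$

$FW = (\nu w)(w[\overline{out}\ w.\,\overline{in}\ w.\,open\ m.\,P \mid k[out\ w.\,in\ m.\,\overline{open}\ k.\,\langle in\ w\rangle]])$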

FW represents the firewall and AG a trusted agent which crosses the firewall 
thanks to the pilot ambient k. The name k plays the role of a key which allows 
only authorized agents to cross the firewall. 



^ As in [9], two processes are ~-equivalent iff no closing context can tell them apart.
Let P and Q be such that:

W,£hQ:{},S?,T 
U,Eh P-.{],StT 

where T = Cap[{fw}, { }, T'] for some T'. We assume that processes P and
Q do not perform move actions, in order to show that in this case we can type w as
immobile. Let E be an environment such that:

U,E^w. fw^[T^] _ 

U,Ehk: 

Let Pm (resp. P^ and Pk) be the process inside the ambient m (resp. w and k). 
In the environment E, we can derive the following judgments: 

U,Eh Pm - {_fw,ag},{aut,ag,{{ },S^}},T 
U,E^P^: {fw}, {ag}^ 

U,E\- Pk : {fw, ag}, {aut}, T 

and we obtain the following typed versions of the protocols: 

AG = m[in m.open k. 

(x : Cap[{fw}, { }, T'Jjx.open m.Q] 

FW = {uw : fw^[T^])(u;[out w.in ui.open m.P 
I fc[out ui.in m.open fc.(in «^)]]) 

We can type the ambient w as immobile, since opening the mobile ambient m 
does not unleash mobility capabilities. We obtain this information from the type 
of m and the possibility of typing $P_m$ with an expanded effect. The presence of
the co-action $\overline{open}\ m$ is crucial for having such control over the behaviour of an
ambient.

The desired property is that:

$(\nu m)(AG \mid FW)$ ~ $(\nu m, w)(w[P \mid Q])$

under the conditions $w \notin fn(Q)$ and $m \notin fn(P)$. These conditions are assured
by fw ^ and ag ^ S^. We do not see how to express this requirement using 
only types in the system of [9]. 

Finally, we observe that the following constraints on the order relation among 
security levels are needed to correctly type the whole protocol: 

fw ag aut 

and 

fw >° ag >° aut 

5.4 Channels 

The encoding of the asynchronous typed π-calculus is an important test of expressiveness
for the ambient calculus. Several encodings have been proposed for both
typed and untyped mobile ambients and for safe ambients. Here we consider the
typed version in our system of the simpler untyped protocol presented in [9]:







M. Dezani-Ciancaglini and I. Salvo 



|p(gi, . . . ,g„)l =p[inp.openp.(gi,...,g„).openp.|P]l | openp 
I p(gi, . . . , g„)l = p[in p.open p.{qi,..., g„)] 

l{i^p:T)Pl = ii.p:lTmPl 
IP\QI = IPI\IQI 
|!P1 = ![P1 
[01 =0 

[ Ch(Tu . . . , T„)l = Pro4[[ Til X . . . X [ T„l-] 

Ip:Tl =p:in 
ir,p:n = iriip:Ti 

where the type grammar is $T ::= Ch(T_1, \ldots, T_n)$, n > 0.

If $\Gamma \vdash P$ can be derived using the standard rules for the asynchronous typed
π-calculus [9] then we get either $[\![\Gamma]\!] \vdash [\![P]\!] : \{\,\}, \{Proc\}, Shh$ or $[\![\Gamma]\!] \vdash [\![P]\!] : \{\,\}, \{\,\}, Shh$.

5.5 Secure Channels 

This example is inspired by the information flow analysis carried out in [7]. We
want to model a π-calculus in which each channel p is associated with a write and
a read access right. Moreover, a π-calculus process can be defined as running
at a given security level^. In accordance with the principle that information can
move only upward, the security policy imposes that a channel can communicate
information only to π-calculus processes with a security level greater than its
write access right and can receive information only from π-calculus processes
with a security level less than its read access right. In the π-calculus process:

$\sigma\{\overline{p}\langle q\rangle\} \mid \rho\{p(q)P\}$

we have that the process $\sigma\{\overline{p}\langle q\rangle\}$ runs at the security level $\sigma$ and the process
$\rho\{p(q)P\}$ runs at the security level $\rho$. Communication can take place if the write
access right of p, $W_p$, is less than or equal to $\rho$ and if the read access right of p,
$R_p$, is greater than or equal to $\sigma$.

We show an encoding of π-calculus channels with read and write access rights
into Safe Ambients. The basic idea is to modify the encoding of channels
in Subsection 5.4 as follows:

- the order <° among security levels in ambient types represents the order
among π-calculus security levels;

- each channel p is represented as a pair of ambients $p^w$ and $p^r$; such ambients
belong to security levels $w_p$ and $r_p$ respectively;

- the security levels $w_p$ and $r_p$ play in the ambient encoding of secure channels
the role of access rights $W_p$ and $R_p$;

- in the encoding of a π-calculus process $\rho\{P\}$, an ambient $\rho$, belonging to a
security level $s_\rho$, represents the security level of a process;

^ “security level” is overloaded in this subsection, since we consider both security
levels of π-calculus processes and security levels of ambient types. No ambiguity
arises thanks to the context.
- in the encoding of the π-calculus process $\sigma\{\overline{p}\langle q\rangle\} \mid \rho\{p(q)P\}$, the communication
protocol imposes that the ambient $\rho$ must be able to open $p^w$, and $p^r$
must be able to open $\sigma$, so that the protocol is well-typed only if:

$s_\rho$ >° $w_p$

$s_\sigma$ <° $r_p$

The encoding is rather complicated, because a process does not know the security 
level of the process with which communication will happen and hence we need 
two auxiliary ambients p™ and p”. This implies that also the channel q will be 
represented by four ambients, g*" , , g*” , g” • 

I o'{p(g)}]] = cr[in p’’.out a.open a 

I p’”[out cr.open p™.open a] 

I p“[in p™.open p”.op^p’".(g’’, g™, g*”, g”)] 

] 

I p{p{x)P}} = p'’[in p’’.open p’”.out p'’.in p.open p'’.(x'’, x*”, x”).| -Pi 

I p" [in p™ .open p" . in p] 

I p[in p.open p™. out p’’.in p.open p] 

] 

I open p'' open p 

We have that: 

|p{p(x)P}l I Ia{p(g)}l IPMK := g^x“ := g“,x™ := g™,x" := g"} 

Let |C/i(T)l =w^[[rl2] X r£[lTl2] X w^[|Tl2] x r^[lTl2], where w (resp. 
r) is the write (resp. read) access right of a channel of type T and suppose the 
channel g has type T in the typed yr-calculus. We can type the above protocol 
in an environment E such that: 

U,E^a-.s,±[lCh{T)i':X] 

U,Eh p-.Sp°-[lCh{T)\':X\ 

W,Php“,p” :Wpl[IC/r(r)l-] 

U,Ehp-,p^ : rp°-[lCh{T)j2] 
i^,Phg-,g":w,°[[C/i(T)la] 
W,^;hg^g-:r,aCMT)l2] 



6 Conclusion 

We presented two type systems for safe ambients that guarantee security proper- 
ties. While the first one could be adapted without problems to mobile ambients, 
the second one requires co-actions in order to distinguish the effects before and 
after a (unique!) open action. 

Both type systems describe in a rather precise way the behaviours of ambients
and could be used to establish a typed equivalence theory of ambients, similarly
to what was done for the π-calculus in [11].

The typing rules are syntax directed and therefore there is a type checking
algorithm quite similar to that of [12]. It is not clear to us what type
inference means for systems which look for security properties, since these are just
expressed by the environments and by the types of bound names.

Acknowledgment. The authors are grateful to the referees for their helpful 
comments. 
Abstract. Modern multi-paradigm declarative languages integrate features
from functional, logic, and concurrent programming. Since programs
in these languages make extensive use of list-processing functions,
we consider the development of list-processing optimization
techniques to be of great interest. In this work, we consider the adaptation of the
well-known difference-lists transformation from the logic programming
paradigm to our integrated setting. Unfortunately, the use of difference-lists
is impractical due to the absence of non-strict equality in lazy (call-by-name)
languages. Nevertheless, we have developed a novel, stepwise
transformation which achieves a similar effect over functional logic programs.
We also show a simple and practical approach to incorporate the
optimization into a real compiler. Finally, we have conducted a number
of experiments which show the practicality of our proposal.

Keywords: functional logic programming, program transformation, com- 
piler optimization 



1 Introduction 

In recent years, several proposals have been made to amalgamate functional 
and logic programming languages. These multi-paradigm languages combine fea- 
tures from functional programming (nested expressions, lazy evaluation), logic 
programming (logical variables, partial data structures), and concurrent pro- 
gramming (concurrent evaluation of constraints with synchronization on logical 
variables). The operational semantics of modern multi-paradigm languages is 
based on needed narrowing, which is currently the best narrowing strategy for 
lazy functional logic programs due to its optimality properties [4]. Needed nar- 
rowing provides completeness in the sense of logic programming (computation 
of all solutions) as well as functional programming (computation of values), and 
it can be efficiently implemented by pattern matching and unification. 

* This work has been partially supported by CICYT TIC 98-0445-C03-01, by 
Accion Integrada hispano-alemana HA1997-0073, by the German Research Coun- 
cil (DFG) under grant Ha 2457/1-1., and by the Generalitat Valenciana under grant 
FSTBFCOO- 14-32 
Example 1. Consider the function isShorter which is defined by the equations:

isShorter([], ys) = True 

isShorter(x : xs, []) = False 

isShorter(x : xs, y : ys) = isShorter(xs, ys) 

where “[]” and “:” are the constructors of lists. The expression isShorter(x : xs, z)
can be evaluated, for instance, by instantiating z to (y:ys) to apply the
third equation, followed by the instantiation of xs to [] to apply the first equation:

$isShorter(x : xs, z) \leadsto_{\{z \mapsto y:ys\}} isShorter(xs, ys) \leadsto_{\{xs \mapsto []\}} True$

In general, given a term like isShorter(l1, l2), it is always necessary to evaluate
l1 (to some head normal form) since all three equations in Example 1 have a non-variable
first argument. On the other hand, the evaluation of l2 is only needed if
l1 is of the form (_:_). Thus, if l1 is a free variable, needed narrowing instantiates
it to a constructor term, here [] or (_:_). Depending on this instantiation, either
the first equation is applied or the second argument l2 is evaluated.
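
The demand pattern just described can be imitated outside a functional logic language. The following Python sketch is only an illustration under stated assumptions: lists are ordinary Python lists, the second argument is passed as a zero-argument callable (a hypothetical stand-in for an unevaluated expression), and the instantiation of free variables by narrowing is not modelled at all.

```python
def is_shorter(xs, ys_thunk):
    """Mimic isShorter: the second argument is forced only when needed.

    xs       -- an already evaluated Python list (the first argument)
    ys_thunk -- zero-argument callable returning the second list
                (assumption: stands in for a lazily evaluated expression)
    """
    if not xs:
        # First equation: isShorter([], ys) = True; ys is never demanded.
        return True
    ys = ys_thunk()  # the first argument has the form (_:_), so ys is needed
    if not ys:
        # Second equation: isShorter(x:xs, []) = False
        return False
    # Third equation: isShorter(x:xs, y:ys) = isShorter(xs, ys)
    return is_shorter(xs[1:], lambda: ys[1:])
```

Calling `is_shorter([], f)` returns `True` without ever invoking `f`, which corresponds to the observation that the second argument is only evaluated when the first one is a non-empty list.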

Since functional (logic) programmers make extensive use of list-processing
functions, we consider the development of list-processing optimization
techniques to be of great interest. In this work, we consider a well-known list-processing
optimization from the logic programming community. Most Prolog programmers
know how to use difference-lists to improve the efficiency of list-processing
programs significantly. Informally, a difference-list is a pair of lists whose second
component is a suffix of the first. For example, the list 1:2:[] is encoded as
a pair (1:2:xs, xs), where xs is a logical variable. The key to success in optimizing
programs by difference-lists is the use of a constant-time concatenation:
append((x, y), (y, z), (x, z)). Unfortunately, if we try to adapt this technique to
a functional logic context, we find several problems. In particular, a common
restriction in lazy functional logic languages is to require left-linear rules, i.e.,
the left-hand sides of the rules cannot contain several occurrences of the same
variable. In principle, this restriction does not permit the encoding of concatenation
of difference-lists as a rule of the form append*((x, y), (y, z)) = (x, z)
and, consequently, prevents us from having difference-lists in lazy functional languages
(at least, at runtime). Therefore, we are interested in a transformation
process in which the final program does not contain occurrences of difference-lists.
To achieve this goal we considered that, in some cases, programs using
difference-lists are structurally similar to programs written using “accumulating
parameters” [13]. Compare, for instance, an optimized version of quicksort by
difference-lists (see Sect. 4):

qs*([], (ys, ys)).
qs*(x:xs, (ys, ys')) :- split(x, xs, l, r), qs*(l, (ys, x:w)), qs*(r, (w, ys')).

and by accumulating parameters:

qs_acc([], ys, ys).
qs_acc(x:xs, ys', ys) :- split(x, xs, l, r), qs_acc(r, ys', w), qs_acc(l, x:w, ys).
We will show that this idea can be generalized, giving rise to an optimization
technique which achieves a similar effect over functional logic programs and
which always returns a program without difference-lists.

The structure of the paper is as follows. After some preliminary definitions 
in the next section, Sect. 3 describes the language syntax and the operational 
semantics referenced in our approach. Section 4 introduces a transformation 
technique (based on the use of difference-lists) which improves a certain class 
of list-processing programs and shows its correctness and effectiveness. An ex- 
perimental evaluation of our optimization is shown in Sect. 5. Finally, Sect. 6 
presents some related work and Sect. 7 concludes. An extended version of this 
paper can be found in [1]. 



2 Preliminaries 

In this section we recall some basic notions from term rewriting [5] and functional
logic programming [7]. We consider a (many-sorted) signature $\Sigma$ partitioned into
a set $\mathcal{C}$ of constructors and a set $\mathcal{F}$ of (defined) functions or operations. There
is at least one sort Bool containing the constructors True and False. The set
of constructor terms with variables (e.g., x, y, z) is obtained by using symbols
from $\mathcal{C}$ and $\mathcal{X}$. The set of variables occurring in a term t is denoted by Var(t).
A term t is ground if $Var(t) = \emptyset$. A term is linear if it does not contain multiple
occurrences of one variable. We write $\overline{o_n}$ for the list of objects $o_1, \ldots, o_n$.

A pattern is a term of the form $f(\overline{d_n})$ where $f/n \in \mathcal{F}$ and $d_1, \ldots, d_n$ are
constructor terms. A term is operation-rooted if it has an operation symbol at
the root. A position p in a term t is represented by a sequence of natural numbers
($\Lambda$ denotes the empty sequence, i.e., the root position). $t|_p$ denotes the subterm
of t at position p, and $t[s]_p$ denotes the result of replacing the subterm $t|_p$ by the
term s (see [5] for details).

We denote by $\{x_1 \mapsto t_1, \ldots, x_n \mapsto t_n\}$ the substitution $\sigma$ with $\sigma(x_i) = t_i$ for
$i = 1, \ldots, n$ (with $x_i \neq x_j$ if $i \neq j$), and $\sigma(x) = x$ for all other variables x. The
set $Dom(\sigma) = \{x \in \mathcal{X} \mid \sigma(x) \neq x\}$ is called the domain of $\sigma$. A substitution
$\sigma$ is (ground) constructor, if $\sigma(x)$ is (ground) constructor for all $x \in Dom(\sigma)$.
The identity substitution is denoted by id. Given a substitution $\theta$ and a set
of variables $V \subseteq \mathcal{X}$, we denote by $\theta|_V$ the substitution obtained from $\theta$ by
restricting its domain to V. We write $\theta = \sigma\ [V]$ if $\theta|_V = \sigma|_V$, and $\theta \le \sigma\ [V]$
denotes the existence of a substitution $\gamma$ such that $\gamma \circ \theta = \sigma\ [V]$.

A set of rewrite rules $l = r$ such that $l \notin \mathcal{X}$ and $Var(r) \subseteq Var(l)$ is called
a term rewriting system (TRS). The terms l and r are called the left-hand side
and the right-hand side of the rule, respectively. A TRS $\mathcal{R}$ is left-linear if l is
linear for all $l = r \in \mathcal{R}$. A TRS is constructor-based (CB) if each left-hand side
is a pattern. A rewrite step is an application of a rewrite rule to a term, i.e.,
$t \to_{p,R} s$ if there is a position p in t, a rewrite rule $R = (l = r)$ and a substitution
$\sigma$ with $t|_p = \sigma(l)$ and $s = t[\sigma(r)]_p$. In the following, a functional logic program
is a left-linear CB-TRS.
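
The notions of position, subterm replacement and rewrite step can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: variables are Python strings, an application f(t1, ..., tn) is the tuple ("f", t1, ..., tn), constants are one-element tuples, and positions are tuples of 1-based argument indices with the empty tuple as the root position. The names `subterm`, `replace`, `match`, `apply` and `rewrite_step` are hypothetical helpers, and the rules R1 to R3 are the isShorter equations of Example 1 (abbreviated sh).

```python
def is_var(t):
    return isinstance(t, str)

def subterm(t, p):
    """t|p: follow the argument indices in p (index 0 holds the symbol)."""
    for i in p:
        t = t[i]
    return t

def replace(t, p, s):
    """t[s]p: replace the subterm of t at position p by s."""
    if not p:
        return s
    i = p[0]
    return t[:i] + (replace(t[i], p[1:], s),) + t[i + 1:]

def apply(t, sigma):
    """Apply the substitution sigma (a dict from variables to terms) to t."""
    if is_var(t):
        return sigma.get(t, t)
    return (t[0],) + tuple(apply(a, sigma) for a in t[1:])

def match(pattern, t, sigma=None):
    """Return sigma with sigma(pattern) == t, or None (pattern must be linear)."""
    sigma = {} if sigma is None else sigma
    if is_var(pattern):
        sigma[pattern] = t  # linearity: each variable is bound only once
        return sigma
    if is_var(t) or pattern[0] != t[0] or len(pattern) != len(t):
        return None
    for a, b in zip(pattern[1:], t[1:]):
        if match(a, b, sigma) is None:
            return None
    return sigma

def rewrite_step(t, p, rule):
    """One rewrite step at position p with rule (l, r), or None if l does not match."""
    lhs, rhs = rule
    sigma = match(lhs, subterm(t, p))
    return None if sigma is None else replace(t, p, apply(rhs, sigma))

NIL = ("[]",)
R1 = (("sh", NIL, "ys"), ("True",))
R2 = (("sh", (":", "x", "xs"), NIL), ("False",))
R3 = (("sh", (":", "x", "xs"), (":", "y", "ys")), ("sh", "xs", "ys"))
```

For instance, rewriting sh(a:[], b:[]) at the root with R3 gives sh([], []), and a second step with R1 gives True. Only matching is implemented here, so terms with free variables at demanded positions are not handled; that is the role of narrowing.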
To evaluate terms containing variables, narrowing non-deterministically instantiates
the variables so that a rewrite step is possible. Formally, $t \leadsto_{p,R,\sigma} t'$
is a narrowing step if p is a non-variable position in t and $\sigma(t) \to_{p,R} t'$. We
denote by $t_0 \leadsto^*_\sigma t_n$ a sequence of narrowing steps $t_0 \leadsto_{\sigma_1} \ldots \leadsto_{\sigma_n} t_n$ with
$\sigma = \sigma_n \circ \cdots \circ \sigma_1$ (if n = 0 then $\sigma = id$). Due to the presence of free variables, an
expression may be reduced to different values after instantiating free variables
to different terms. In functional programming, one is interested in the computed
value whereas logic programming emphasizes the different bindings (answers).
In our integrated setting, given a narrowing derivation $t \leadsto^*_\sigma d$ to a constructor
term d (possibly with variables), we say that d is the computed value and $\sigma$ is
the computed answer for t.

3 The Language 

Modern functional logic languages are based on needed narrowing and inductively
sequential programs. Needed narrowing [4] is currently the best known
narrowing strategy due to its optimality properties w.r.t. the length of successful
derivations and the number of computed solutions. It extends Huet and
Lévy's notion of a needed reduction [10]. The definitions of inductively sequential
programs and of the needed narrowing strategy are based on the notion of a
definitional tree [3]. Roughly speaking, a definitional tree for a function symbol
f is a tree whose leaves contain all (and only) the rules used to define f and
whose inner nodes contain information to guide the pattern matching during the
evaluation of expressions. Each inner node has a pattern and a variable position
in this pattern (the inductive position) which is further refined in the patterns
of its immediate children by using different constructor symbols. The pattern of
the root node is simply $f(\overline{x_n})$, where $\overline{x_n}$ are different variables. Formally, given
a program $\mathcal{R}$, a definitional tree $\mathcal{P}$ with pattern $\pi$ is an expression of the form:

$\mathcal{P} = rule(\pi = r')$ where $\pi = r'$ is a variant of a program rule $l = r \in \mathcal{R}$.

$\mathcal{P} = branch(\pi, p, \mathcal{P}_1, \ldots, \mathcal{P}_n)$ where p is a variable position of $\pi$ (called the
inductive position), $c_1, \ldots, c_n$ are different constructors for n > 0, and each
$\mathcal{P}_i$ is a definitional tree with pattern $\pi[c_i(x_1, \ldots, x_k)]_p$ where k is the arity
of $c_i$ and $x_1, \ldots, x_k$ are new variables.

A graphic representation of definitional trees, where each inner node is marked 
with a pattern, the inductive position in branches is surrounded by a box, and 
the leaves contain the corresponding rules is often used to illustrate this notion 
(see, e.g., the definitional tree for the function isShorter of Example 1 in Fig. 1, 
here abbreviated as sh). 

A defined function is called inductively sequential if it has a definitional tree.
A rewrite system $\mathcal{R}$ is called inductively sequential if all its defined functions are
inductively sequential. Note that inductively sequential programs are a particular
case of left-linear CB-TRSs.

sh(x1, x2)   (inductive position 1)
    sh([], x2) = True
    sh(x : xs, x2)   (inductive position 2)
        sh(x : xs, []) = False
        sh(x : xs, y:ys) = sh(xs, ys)

Fig. 1. Definitional tree for isShorter

In order to compute needed narrowing steps for an operation-rooted term
t, we take a definitional tree $\mathcal{P}$ for the root of t and compute $\lambda(t, \mathcal{P})$. Here, $\lambda$
is a narrowing strategy which returns triples $(p, R, \sigma)$ containing a position, a
rule, and a substitution. Formally, if t is an operation-rooted term and $\mathcal{P}$ is a
definitional tree with pattern $\pi$, $\pi \le t$, then $\lambda(t, \mathcal{P})$ is defined as follows [4]:



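\[
\lambda(t,\mathcal{P}) \ni
\begin{cases}
(\Lambda,\ l = r,\ id) & \text{if } \mathcal{P} = rule(l = r);\\[4pt]
(q,\ l = r,\ \sigma \circ \tau) & \text{if } \mathcal{P} = branch(\pi, p, \mathcal{P}_1, \ldots, \mathcal{P}_k),\ t|_p = x \in \mathcal{X},\\
 & \tau = \{x \mapsto c_i(\overline{x_n})\},\ pattern(\mathcal{P}_i) = \pi[c_i(\overline{x_n})]_p,\\
 & \text{and } (q,\ l = r,\ \sigma) \in \lambda(\tau(t), \mathcal{P}_i);\\[4pt]
(q,\ l = r,\ \sigma) & \text{if } \mathcal{P} = branch(\pi, p, \mathcal{P}_1, \ldots, \mathcal{P}_k),\ t|_p = c_i(\overline{t_n}),\\
 & pattern(\mathcal{P}_i) = \pi[c_i(\overline{x_n})]_p,\ \text{and } (q,\ l = r,\ \sigma) \in \lambda(t, \mathcal{P}_i);\\[4pt]
(p.q,\ l = r,\ \sigma) & \text{if } \mathcal{P} = branch(\pi, p, \mathcal{P}_1, \ldots, \mathcal{P}_k),\ t|_p = f(\overline{t_n}),\ f \in \mathcal{F},\\
 & \mathcal{P}' \text{ is a definitional tree for } f,\ \text{and } (q,\ l = r,\ \sigma) \in \lambda(t|_p, \mathcal{P}').
\end{cases}
\]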



Then, for all $(p, R, \sigma) \in \lambda(t, \mathcal{P})$, $t \leadsto_{p,R,\sigma} t'$ is a needed narrowing step. Informally,
needed narrowing applies a rule, if possible; otherwise, it checks the
subterm corresponding to the inductive position of the branch: if it is a variable,
we instantiate it to the constructor of a child; if it is already a constructor, we
proceed with the corresponding child; finally, if it is a function, we evaluate it
by recursively applying needed narrowing. For inductively sequential programs,
needed narrowing is sound and complete w.r.t. strict equations (i.e., both sides
must reduce to the same ground constructor term) and constructor substitutions
as solutions [4].
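
The four cases of the strategy can be sketched directly in Python. This is a minimal illustration under stated assumptions, not the implementation used by any actual system: variables are strings, an application is a tuple whose first component is the symbol, positions are tuples of 1-based argument indices, a leaf is ("rule", lhs, rhs), and an inner node is ("branch", p, children) with the children indexed by constructor name (the patterns of inner nodes are omitted because only the inductive position is consulted). The names `lam`, `SH_TREE` and `TREES` are hypothetical.

```python
import itertools

CONSTRUCTORS = {"[]": 0, ":": 2, "True": 0, "False": 0}  # name -> arity
_fresh = itertools.count()

def is_var(t):
    return isinstance(t, str)

def subterm(t, p):
    for i in p:
        t = t[i]
    return t

def apply(t, s):
    if is_var(t):
        return s.get(t, t)
    return (t[0],) + tuple(apply(a, s) for a in t[1:])

def compose(sigma, tau):
    """sigma o tau: apply tau first, then sigma."""
    out = {x: apply(t, sigma) for x, t in tau.items()}
    for x, t in sigma.items():
        out.setdefault(x, t)
    return out

def lam(t, tree, trees):
    """Return the list of triples (position, rule, substitution) for t."""
    if tree[0] == "rule":                      # case 1: apply the rule at the root
        return [((), (tree[1], tree[2]), {})]
    _, p, children = tree
    sub = subterm(t, p)
    if is_var(sub):                            # case 2: instantiate the variable
        steps = []
        for c, child in children.items():
            args = tuple(f"_v{next(_fresh)}" for _ in range(CONSTRUCTORS[c]))
            tau = {sub: (c,) + args}
            for q, r, sigma in lam(apply(t, tau), child, trees):
                steps.append((q, r, compose(sigma, tau)))
        return steps
    if sub[0] in CONSTRUCTORS:                 # case 3: follow the matching child
        child = children.get(sub[0])
        return lam(t, child, trees) if child is not None else []
    # case 4: the subterm is function-rooted, so evaluate it first
    return [(p + q, r, s) for q, r, s in lam(sub, trees[sub[0]], trees)]

NIL = ("[]",)
SH_TREE = ("branch", (1,), {
    "[]": ("rule", ("sh", NIL, "ys"), ("True",)),
    ":": ("branch", (2,), {
        "[]": ("rule", ("sh", (":", "x", "xs"), NIL), ("False",)),
        ":": ("rule", ("sh", (":", "x", "xs"), (":", "y", "ys")), ("sh", "xs", "ys")),
    }),
})
TREES = {"sh": SH_TREE}

# Needed narrowing steps for sh(x:xs, z), the term used in Example 1.
steps = lam(("sh", (":", "x", "xs"), "z"), SH_TREE, TREES)
```

For sh(x:xs, z) the sketch yields two steps at the root position: one binds z to [] and selects the second equation, the other binds z to a fresh cons cell and selects the third. The sketch returns rules without renaming their variables apart from those of the term, which a real implementation would have to do before applying them.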



4 Optimization by Accumulating Parameters 

In this section, we introduce a new transformation for optimizing functions that 
independently build different sections of a list to be later combined together [13]. 
The development of this section is inspired by the well-known difference-list 
transformation from the logic programming community [12,13]. 

The idea behind the difference-list transformation of [12] is to replace certain
lists by terms called difference-lists in order to expose opportunities for a
faster concatenation. A difference-list is represented as a pair of lists whose second
component is a suffix of the first. For example, the list 1:2:[] is encoded as
the pair (1:2:xs, xs), where xs is a logical variable. Therefore, a difference-list
represents the list which results from removing the suffix from the first component.
Informally, a difference-list can be seen as a “list plus a pointer to its tail”.
By virtue of the new representation, such a pointer may avoid traversing some
lists represented by difference-lists, since the concatenation of difference-lists is
a constant-time operation: append_dl((x, y), (y, z), (x, z)). Therefore, predicates
using append_dl benefit from its improved runtime, as we now illustrate
by considering the quicksort algorithm:

qs([], []).
qs(x:xs, ys) :- split(x, xs, l, r), qs(l, z), qs(r, w), append(z, x:w, ys).

The definition of the predicate split is not relevant here; it is sufficient to know
that, given a call split(x, xs, l, r), it returns all the elements of the list xs which
are less than x in l, and those which are greater than x in r. Following [12],
the second argument of qs and all the arguments of append need to be changed
to difference-lists by using the equivalences:

[] ⇒ (y, y)
t1: ... :tn:[] ⇒ (t1: ... :tn:y, y)
x ⇒ (x, y)

where y is a fresh variable. Thus, we obtain the program:

qs(xs, ys) :- qs*(xs, (ys, [])).
qs*([], (ys, ys)).
qs*(x:xs, (ys, ys')) :- split(x, xs, l, r), qs*(l, (z, zs)), qs*(r, (w, ws)),
    append_dl((z, zs), (x:w, ws), (ys, ys')).

Note that the first rule is introduced to relate the new predicate qs* and the
original qs (since the difference-list (ys, []) is equivalent to the standard list ys).
By unfolding the call to append_dl, we get an improved definition of qs:

qs(xs, ys) :- qs*(xs, (ys, [])).
qs*([], (ys, ys)).
qs*(x:xs, (ys, ys')) :- split(x, xs, l, r),
    qs*(l, (ys, x:w)), qs*(r, (w, ys')).
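
The “list plus a pointer to its tail” reading of difference-lists, and the constant-time behaviour of append_dl, can be imitated in Python. The sketch below is an assumption-laden model rather than a rendering of Prolog semantics: an unbound logical variable is an object of the hypothetical class `Var`, cons cells are tuples ("cons", head, tail), and binding a variable is a destructive assignment that cannot be undone by backtracking.

```python
class Var:
    """An unbound logical variable; `value` is set once when it is bound."""
    def __init__(self):
        self.value = None

def deref(t):
    """Follow variable bindings until a cell or an unbound variable is reached."""
    while isinstance(t, Var) and t.value is not None:
        t = t.value
    return t

def from_list(items):
    """Encode t1:...:tn:[] as the difference-list (t1:...:tn:y, y), y fresh."""
    hole = Var()
    front = hole
    for item in reversed(items):
        front = ("cons", item, front)
    return front, hole

def append_dl(a, b):
    """append_dl((x, y), (y, z), (x, z)): bind the hole of a to the front of b."""
    (x, y), (y2, z) = a, b
    y.value = y2          # a single binding, independent of the list lengths
    return x, z

def to_list(dl):
    """Close the difference-list with [] (as in (ys, [])) and read it off."""
    front, hole = dl
    hole.value = ("nil",)
    out = []
    t = deref(front)
    while t != ("nil",):
        out.append(t[1])
        t = deref(t[2])
    return out
```

With these definitions, `to_list(append_dl(from_list([1, 2]), from_list([3])))` evaluates to `[1, 2, 3]`, and the concatenation itself performs one assignment regardless of how long the first list is. Note that each difference-list can be used as the first argument of `append_dl` only once, since its hole is consumed by the binding.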

In an attempt to adapt this technique to a functional logic context, we find
several problems. In particular, a common restriction in lazy functional logic
languages is to require left-linear rules, i.e., the left-hand sides of the rules cannot
contain several occurrences of the same variable. In principle, this restriction
prevents us from encoding the concatenation of difference-lists as a rule of the
form: append*((x, y), (y, z)) = (x, z). Of course, we can transform it into:

append_dl((x, y), (w, z)) = (x, z) ⇐ y == w

by using a guarded expression (i.e., a conditional expression). However, in order
to keep the effectiveness of the transformation, the equality symbol “==” should
be interpreted as syntactic unification, which is not allowed in lazy functional
logic programs where only strict equality is permitted. In general, the manipulation
of difference-lists requires the use of non-strict equality in order to assign
terms to the pointers of difference-lists. Therefore, we are interested in a transformation
process in which the final program does not contain occurrences of
difference-lists (nor calls to append*).

To achieve this goal, we considered that, in some cases, programs using 
difference-lists are structurally similar to programs written using accumulators. 
For instance, quicksort can be defined using accumulators as follows: 

qs(xs, ys) :- qs_acc(xs, [], ys).

qs_acc([], ys, ys).

qs_acc(x:xs, ys', ys) :- split(x, xs, l, r), qs_acc(r, ys', w), qs_acc(l, x:w, ys).

There are only two differences between this program and the difference-list version.
The first difference is syntactic: the difference-list is represented as two
independent arguments, but in reverse order, the tail preceding the head. The
second difference is the goal order in the body of the recursive clause of qs_acc.
The net effect is that the sorted list is built bottom-up from its tail, rather than
top-down from its head [13].

Now we show, by means of an example, an adaptation of the difference-list 
transformation to a functional logic language. 

4.1 An Example of the Difference-lists Transformation 

Consider again the quicksort algorithm, but now with a functional (logic) syntax:

qs([]) = []
qs(x:xs) = append(qs(l), x : qs(r))
    where (l, r) = split(x, xs)

Here, both qs and split are the functional counterparts of the predicates used
in the previous section.

As dictated by the method of [12], the three arguments of the predicate
append as well as the second argument of the predicate qs should be replaced by
difference-lists. Similarly, in our functional syntax, we will replace the arguments
of the function append and the result of both functions by difference-lists. From
the previous section, we know how to transform different kinds of standard lists
into difference-lists; now, however, we are faced with a new situation which raises
the question: how can we transform an operation-rooted term into a difference-list?
To solve this problem, we allow the flattening of some calls by using a sort
of conditional expression. The main difference with standard guarded expressions
is that, in order to preserve the semantics, we use a syntactic (non-strict)
equality for the equations in the condition. In this way, we get the following
transformed program:

qs(x) = y ⇐ (y, []) ≈ qs*(x)
qs*([]) = (x, x)
qs*(x:xs) = append*((z, zs), (x:w, ws)) ⇐ (z, zs) ≈ qs*(l), (w, ws) ≈ qs*(r)
    where (l, r) = split(x, xs)
By defining the constant-time append* by the rule:

append*((x, y), (y, z)) = (x, z)

we can unfold the calls to append* as follows:

qs(x) = y ⇐ (y, []) ≈ qs*(x)
qs*([]) = (x, x)
qs*(x:xs) = (z, ws) ⇐ (z, x:w) ≈ qs*(l), (w, ws) ≈ qs*(r)
    where (l, r) = split(x, xs)

In contrast to [12], we now want to remove difference-lists from the program.
Intuitively, since we only allow difference-lists in the result of functions, the
second argument of the difference-list is used to construct the final result
progressively and, thus, we can replace it by an “accumulating parameter”.

Also, since the calls to qs* are flattened using a conditional expression, we 
need to move the second argument of the difference-list to the corresponding call 
to qs*. Thus, we obtain the program: 

qs(x) = y <= y ≈ qsacc(x, [])
qsacc([],x) = x
qsacc(x:xs,ws) = z <= z ≈ qsacc(l, x:w), w ≈ qsacc(r, ws)
    where (l,r) = split(x,xs)

where qs* is renamed as qsacc. Finally, by simplifying the equations in the
conditions (i.e., by unifying them), we achieve the desired optimization:

qs(x) = qsacc(x, [])
qsacc([],x) = x
qsacc(x:xs,ws) = qsacc(l, x : qsacc(r,ws))
    where (l,r) = split(x,xs)

which gives a similar improvement as the optimized predicate qs* above. Indeed, 
thanks to the use of accumulating parameters, we avoid the traversal of the list 
computed by qs(l) on each recursive call. In general, this optimization is able 
to produce superlinear speedups [12,13]. 
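
The effect of the optimization can be sketched outside the functional logic setting. The following Python fragment is only an illustration under stated assumptions: the definition of split (elements not greater than the pivot versus the rest) is assumed, since the section above does not repeat it, and cons cells are modelled as (head, tail) pairs with None as the empty list so that x : rest takes constant time. The names qs_opt and to_list are hypothetical helpers.

```python
def split(pivot, xs):
    """Partition xs into elements <= pivot and elements > pivot (assumed definition)."""
    return [y for y in xs if y <= pivot], [y for y in xs if y > pivot]


def qs(xs):
    """Original scheme: the result of qs(l) is traversed again by the concatenation."""
    if not xs:
        return []
    l, r = split(xs[0], xs[1:])
    return qs(l) + [xs[0]] + qs(r)


def qs_acc(xs, acc):
    """Accumulator scheme: returns the sorted xs placed in front of the cons-list acc."""
    if not xs:
        return acc
    l, r = split(xs[0], xs[1:])
    # Mirrors qsacc(x:xs, ws) = qsacc(l, x : qsacc(r, ws)); no concatenation is needed.
    return qs_acc(l, (xs[0], qs_acc(r, acc)))


def to_list(cell):
    """Convert a cons-list made of (head, tail) pairs into a Python list."""
    out = []
    while cell is not None:
        head, cell = cell
        out.append(head)
    return out


def qs_opt(xs):
    """Entry point corresponding to qs(x) = qsacc(x, [])."""
    return to_list(qs_acc(xs, None))
```

Both qs and qs_opt compute the same sorted list; the second version never concatenates, which is the source of the improvement discussed above.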

4.2 The Stepwise Transformation 

For the purposes of formal definition and correctness results, our method is 
viewed in a stepwise manner: 

(a) Marking Algorithm:

Given a function to be optimized, a marking algorithm is applied in order to
determine which expressions should be replaced by difference-lists.

1. Input: a program $\mathcal{R}$, and a function f whose result type is a list

2. Initialization: $\mathcal{M}_0 = \{\mathtt{f}\}$, $i = 0$

3. Repeat

– for each function f in $\mathcal{M}_i$, mark the right-hand sides of the rules defining f

– propagate marks among expressions by applying the following rules:

$\underline{\mathtt{append}(t_1, t_2)} \to \mathtt{append}(\underline{t_1}, \underline{t_2})$
$\underline{t_1 : t_2} \to t_1 : \underline{t_2}$
$\underline{\mathtt{g}(t_1, \ldots, t_n)} \to \underline{\mathtt{g}}(t_1, \ldots, t_n)$

where $\mathtt{g} \in \mathcal{F}$ is a defined function symbol different from append.^

– if there is a marked expression t such that t is a variable, then return
FAIL;

else $\mathcal{M}_{i+1} = \{\mathtt{h} \mid \underline{\mathtt{h}}(t_1, \ldots, t_k) \text{ appears in } \mathcal{R}\}$
until $\mathcal{M}_i = \mathcal{M}_{i+1}$
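
As an informal illustration of stage (a), the marking process can be sketched in Python. This is a simplification under assumptions of our own: terms are encoded as tagged tuples, a program is a dictionary from function names to the right-hand sides of their rules, and the set of marked symbols is accumulated until it no longer grows. The names var, cons, call, mark and marking are hypothetical and do not come from the implementation described in this paper.

```python
FAIL = None
NIL = ("nil",)


def var(name):
    return ("var", name)


def cons(head, tail):
    return ("cons", head, tail)


def call(fname, *args):
    return ("call", fname, list(args))


def mark(term, found):
    """Propagate a mark through term; return False if a variable gets marked."""
    kind = term[0]
    if kind == "var":
        return False                      # a marked variable makes the algorithm fail
    if kind == "nil":
        return True
    if kind == "cons":
        return mark(term[2], found)       # t1 : t2 passes the mark to the tail only
    name, args = term[1], term[2]
    if name == "append":
        return all(mark(a, found) for a in args)   # both arguments become marked
    found.add(name)                       # g(t1,...,tn): the symbol g itself is marked
    return True


def marking(program, f):
    """Return the set of marked function symbols, or FAIL."""
    marked = {f}
    while True:
        found = set(marked)
        for g in marked:
            for rhs in program.get(g, []):
                if not mark(rhs, found):
                    return FAIL
        if found == marked:
            return marked
        marked = found


qs_prog = {"qs": [NIL,
                  call("append", call("qs", var("l")),
                       cons(var("x"), call("qs", var("r"))))]}
dapp_prog = {"dapp": [call("append",
                           call("append", var("x"), var("y")), var("z"))]}
```

With these encodings, marking(qs_prog, "qs") yields {"qs"}, whereas marking(dapp_prog, "dapp") yields FAIL because the variables x, y and z end up marked.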

(b) Introduction of Difference-lists: 

If the marking algorithm does not return FAIL, then we use the function $\tau$ to
transform expressions rooted by a marked symbol into difference-lists:

$\tau([]) = (y, y)$
$\tau(t_1 : t_2) = (t_1 : s, s')$ where $(s, s') := \tau(t_2)$
$\tau(\mathtt{f}(\overline{t_n})) = (y, y') \Leftarrow (y, y') \approx \mathtt{f}(\overline{t_n})$

where y, y' are fresh variables not appearing in the program and those occurrences
of append whose arguments have been replaced by difference-lists are
renamed as append*. Furthermore, we consider that all marked function symbols
f in the resulting program are replaced by f*. For instance, a rule of the
form $\mathtt{f}(\overline{t_n}) = t_1 : t_2 : \mathtt{f}(\overline{s_n})$ is transformed into:

$\mathtt{f}^*(\overline{t_n}) = (t_1 : t_2 : y, y') \Leftarrow (y, y') \approx \mathtt{f}^*(\overline{s_n})$

As illustrated in the example of Sect. 4.1, when the transformation of several
terms gives rise to conditional expressions, all the equations are joined into a
single condition. The following equation replaces the original definition of f:

$\mathtt{f}(\overline{x_n}) = y \Leftarrow (y, []) \approx \mathtt{f}^*(\overline{x_n})$

Let us remark that the introduction of non-strict equalities does not destroy the
correctness of the transformation, since they can be seen as a technical artifice
in this stage but will be removed from the program in stage (e).

(c) Unfolding of append*: 

The next step consists in unfolding^ the calls to append* using the following 
rule: 



append*((x, y) , (y, z)) = (x, z) 

^ Note that, if the original program is well-typed, a constructor-rooted term $c(\overline{t_n})$
with c different from the list constructors is never marked, thus we do not consider this case.

^ In particular, we use an unfolding similar to [14], but using (needed) narrowing
instead of SLD-resolution (as defined in [2]).

Note that this rule is not legal in a functional logic language. It is used during
the transformation but no calls to append* will appear in the final program.

(d) Use of Accumulating Parameters: 

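Then, we move the second argument of difference-lists to the corresponding
function call as indicated by these rules:

$[\,\mathtt{f}^*(\overline{t_n}) = (y, y') \Leftarrow C\,] \;\Longrightarrow\; [\,\mathtt{f}_{acc}(\overline{t_n}, y') = y \Leftarrow C\,]$

$[\,t \Leftarrow (s, s') \approx \mathtt{f}^*(\overline{t_n})\,] \;\Longrightarrow\; [\,t \Leftarrow s \approx \mathtt{f}_{acc}(\overline{t_n}, s')\,]$

This corresponds to the idea of converting the second argument of the difference-lists
into an accumulating parameter of the function in which the result will be
computed.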

(e) Simplification: 

The final step of the transformation further simplifies the program by unfolding
the (non-strict) equations in the conditional expressions (i.e., by unifying them).
In this way, we guarantee that all conditional expressions are removed from the
program, since the first argument of difference-lists is always a free variable.

Let us illustrate how our strategy proceeds with two examples. As an example
of a complete transformation, consider the following contrived program, which we
use to illustrate the actions taken by each stage:

f([],y) = y:[]
f(x:xs,y) = append(f(xs,y), x : g(xs))
g([]) = []
g(x:xs) = x : g(xs)

If we start the marking algorithm with function f , we get the marked program: 

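$\mathtt{f}([], y) = y : \underline{[]}$
$\mathtt{f}(x{:}xs, y) = \mathtt{append}(\underline{\mathtt{f}}(xs, y),\ x : \underline{\mathtt{g}}(xs))$
$\mathtt{g}([]) = \underline{[]}$
$\mathtt{g}(x{:}xs) = x : \underline{\mathtt{g}}(xs)$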

After replacing the marked expressions by difference-lists: 
f(x,y) = z <= (z, []) ≈ f*(x,y)
f*([],y) = (y:z, z)
f*(x:xs,y) = append*((z,z'), (x:w,w')) <= (z,z') ≈ f*(xs,y), (w,w') ≈ g*(xs)
g(x) = y <= (y, []) ≈ g*(x)
g*([]) = (y,y)
g*(x:xs) = (x:y, y') <= (y,y') ≈ g*(xs)

By unfolding the call to append*:

f*(x:xs,y) = (z,w') <= (z, x:w) ≈ f*(xs,y), (w,w') ≈ g*(xs)

By introducing accumulating parameters:

f(x,y) = z <= z ≈ facc(x,y, [])
facc([],y,z) = y:z
facc(x:xs,y,w') = z <= z ≈ facc(xs,y, x:w), w ≈ gacc(xs,w')
g(x) = y <= y ≈ gacc(x, [])
gacc([],y) = y
gacc(x:xs,y') = x:y <= y ≈ gacc(xs,y')

Finally, by unfolding the conditions, we get: 

f(x,y) = facc(x,y, [])
facc([],y,z) = y:z
facc(x:xs,y,w') = facc(xs,y, x:gacc(xs,w'))
g(x) = gacc(x, [])
gacc([],y) = y
gacc(x:xs,y') = x:gacc(xs,y')

Intuitively, the effect of the transformation is that, in the resulting program, the 
operations over the input list to f are mixed up, while in the original program 
they were built independently (and then combined by the function append). 
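
As a sanity check of this example, the original and the transformed definitions can be transcribed into Python and compared on small inputs. The sketch below uses ordinary Python lists, so it only illustrates that both versions compute the same result, not the gain in efficiency; the names f_acc, g_acc and f_opt are our own renderings of facc, gacc and the transformed f.

```python
def g(xs):
    """g([]) = [] ; g(x:xs) = x : g(xs)"""
    return [] if not xs else [xs[0]] + g(xs[1:])


def f(xs, y):
    """f([],y) = y:[] ; f(x:xs,y) = append(f(xs,y), x : g(xs))"""
    if not xs:
        return [y]
    return f(xs[1:], y) + [xs[0]] + g(xs[1:])


def g_acc(xs, acc):
    """gacc([],y) = y ; gacc(x:xs,y') = x : gacc(xs,y')"""
    return acc if not xs else [xs[0]] + g_acc(xs[1:], acc)


def f_acc(xs, y, acc):
    """facc([],y,z) = y:z ; facc(x:xs,y,w') = facc(xs,y, x : gacc(xs,w'))"""
    if not xs:
        return [y] + acc
    return f_acc(xs[1:], y, [xs[0]] + g_acc(xs[1:], acc))


def f_opt(xs, y):
    """Transformed entry point: f(x,y) = facc(x,y,[])"""
    return f_acc(xs, y, [])
```

For instance, f([1, 2], 0) and f_opt([1, 2], 0) both evaluate to [0, 2, 1, 2].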

As an example of program to which the transformation cannot be applied, 
consider the double append program: 

dapp(x, y, z) = append(append(x, y), z) 

app([],y) = y
app(x:xs,y) = x : app(xs,y)

If we start the marking algorithm with the function dapp, in the first iteration 
we get the following marked rule: 

dapp(x,y,z) = append(append(x, y), z) 

Therefore, stage (a) returns FAIL since the variables x, y, and z have been
marked. Note that by allowing stage (b) to proceed (as actually happens in the original
difference-list transformation), we would obtain the following definition of dapp:

dapp((x,xs),(xs,ys),(ys,z)) = (x,z)

However, stage (c) could not remove the difference-lists from the arguments of dapp
and, thus, we would produce a non-legal program.

Notice that, even if the marking algorithm does not return FAIL, improvement
is not guaranteed (although there is no significant loss of efficiency in these
cases, see the function g in the example above). In order to always guarantee
runtime improvement, stage (a) is only started with functions whose definitions
are of the form append(t1, t2); this way we ensure that, if the method is actually
applied, at least one call to append will be removed and, consequently, some gain
will be achieved. Let us note that functional (logic) programmers routinely use
append to concatenate intermediate results. Therefore, the optimization pays off
in practice.
4.3 Correctness

The correctness of the transformation can be derived from the correctness of
stages (b) and (d), since the remaining stages do not modify the program (stage
a) or are instances of the fold/unfold framework of [2] (stages c and e). In the
following, we develop a proof sketch for stages (b) and (d) under certain conditions
on the form of difference-lists (i.e., only lazily regular lists are allowed in
the first argument of append*, see below).

To prove the correctness of stage (b), we first need to define an adequate
semantics for conditional expressions in transformed programs. Basically, it can
be provided as follows. Let us consider an initial (marked) program $\mathcal{R}_a$ and the
program $\mathcal{R}_b$ obtained from applying stage (b) to $\mathcal{R}_a$. Now, we introduce the
following function $\tau'$:

$\tau'([]) = []$
$\tau'(t_1 : t_2) = t_1 : \tau'(t_2)$
$\tau'(\mathtt{f}(\overline{t_n})) = y \Leftarrow y \approx \mathtt{f}(\overline{t_n})$

This function is used to transform the initial program $\mathcal{R}_a$ into a modified version
$\mathcal{R}'_a$ with the same structure as $\mathcal{R}_b$, but without difference-lists. It should
be clear that each needed narrowing derivation in $\mathcal{R}_a$ can be mimicked in $\mathcal{R}'_a$,
since the only difference is that some expressions containing nested function
symbols have been flattened into (non-strict) equalities. This way, we can define
the semantics of conditional expressions in $\mathcal{R}_b$ in terms of the associated needed
narrowing steps in the original program $\mathcal{R}_a$ (via the equivalence with $\mathcal{R}'_a$). Furthermore,
when evaluating terms in $\mathcal{R}_b$, we allow the flattening of expressions,
as well as the unfolding of equations, in order to preserve the equivalence with
the computations in $\mathcal{R}_a$.

Once the interpretation of conditional expressions is fixed, we can establish
the following equivalence between derivations in $\mathcal{R}_a$ and $\mathcal{R}_b$ where no call to
append occurs. Given an operation-rooted term $e = \mathtt{f}(t_1, \ldots, t_n)$ such that f is
marked by the algorithm in stage (a), then

$e \leadsto^{*}_{\sigma} d : []$ in $\mathcal{R}_a$ iff $e' \leadsto^{*}_{\sigma'} (d : y,\ y)$ in $\mathcal{R}_b$   (*)

where $e' = \mathtt{f}^*(t_1, \ldots, t_n)$, $\sigma = \sigma'\ [\mathcal{V}ar(e)]$, and d represents a (possibly empty)
sequence of elements of the form $d_1 : \ldots : d_k$, $k \geq 0$. Note that, by the definition
of the marking algorithm, the terms $t_1, \ldots, t_n$ cannot contain marked function
symbols. This equivalence can be easily stated by induction on the length of
the derivations, by considering the following three facts: i) no calls to append
(resp. append*) are produced in the first (resp. the second) derivation; ii) the
left-hand sides of the applied rules are the same in both derivations since they
are not changed by stage (b); and iii) the modifications in the right-hand sides
can be easily proven from the equivalence between lists and difference-lists and
the interpretation of conditional expressions. Therefore, we center the discussion
on the correctness of the function append*.

In [12], the notion of regular difference-list is introduced to ensure the correctness
of append*; namely, only calls to append* with a regular difference-list
in the first argument are allowed. Essentially, a difference-list is regular if it is
of the form $(t_1 : \ldots : t_n : y,\ y)$ and y does not appear in $t_1, \ldots, t_n$, i.e., if it denotes
a finite list (here $t_1 : \ldots : t_n : []$). This notion of regularity is not appropriate in our
context due to lazy evaluation, since we can have calls to append* with a non-regular
difference-list in the first argument, and still preserve correctness if this
argument is evaluated to a regular difference-list afterwards. To overcome this
restriction (which would drastically reduce the number of programs amenable
to transformation), we introduce a lazy version of regular list as follows. Given
an expression $e[d_1]_p$ containing a difference-list $d_1$ at some position p, we say
that $d_1$ is lazily regular in a derivation $e[d_1]_p \leadsto^{*}_{\sigma} e'$ iff $\sigma(d_1)$ is regular (i.e., of
the form $(t_1 : \ldots : t_n : y,\ y)$). Now, by using the notion of lazily regular lists, we can
state the correctness of append* as follows.^ Let $e_1, e_2$ be expressions with no
calls to append and let $e'_1, e'_2$ be the corresponding expressions which result from
replacing each call to a marked function f by the corresponding call f*. Then,

$\mathtt{append}(e_1, e_2) \leadsto^{*}_{\sigma} d : \sigma(e_2)$ in $\mathcal{R}_a$

iff

$\mathtt{append}^*((x, xs), (y, ys)) \Leftarrow (x, xs) \approx e'_1,\ (y, ys) \approx e'_2$
$\quad \leadsto^{*}_{\sigma'} (d : y,\ ys) \Leftarrow (y, ys) \approx \sigma'(e'_2)$ in $\mathcal{R}_b$

where $e'_1$ is lazily regular in the second derivation, $\sigma = \sigma'\ [\mathcal{V}ar(\{e_1, e_2\})]$, and d
represents a (possibly empty) sequence of elements of the form $d_1 : \ldots : d_k$, $k \geq 0$.

Let us prove the claim by considering both implications:

($\Rightarrow$) Consider the derivation $\mathtt{append}(e_1, e_2) \leadsto^{*}_{\sigma} d : \sigma(e_2)$ in $\mathcal{R}_a$. By definition of
needed narrowing, it is immediate that $e_1 \leadsto^{*}_{\sigma} d : []$. By equivalence (*), we have
$e'_1 \leadsto^{*}_{\sigma'} (d : z,\ z)$ in $\mathcal{R}_b$, where $\sigma = \sigma'\ [\mathcal{V}ar(e_1)]$. Therefore,

$\mathtt{append}^*((x, xs), (y, ys)) \Leftarrow (x, xs) \approx e'_1,\ (y, ys) \approx e'_2$
$\quad \leadsto_{\{xs \mapsto y\}} (x, ys) \Leftarrow (x, y) \approx e'_1,\ (y, ys) \approx e'_2$
$\quad \leadsto^{*}_{\sigma'} (x, ys) \Leftarrow (x, y) \approx (d : z,\ z),\ (y, ys) \approx \sigma'(e'_2)$
$\quad \leadsto_{\{x \mapsto d:y,\ z \mapsto y\}} (d : y,\ ys) \Leftarrow (y, ys) \approx \sigma'(e'_2)$

and the claim follows.

($\Leftarrow$) Consider the derivation

$\mathtt{append}^*((x, xs), (y, ys)) \Leftarrow (x, xs) \approx e'_1,\ (y, ys) \approx e'_2$
$\quad \leadsto^{*}_{\sigma'} (d : y,\ ys) \Leftarrow (y, ys) \approx \sigma'(e'_2)$

Since $e'_1$ is lazily regular, we have $e'_1 \leadsto^{*}_{\sigma'} (d : z,\ z)$ in $\mathcal{R}_b$. Hence, by equivalence
(*), $e_1 \leadsto^{*}_{\sigma} d : []$ in $\mathcal{R}_a$, where $\sigma = \sigma'\ [\mathcal{V}ar(e_1)]$. Although the evaluation of $e_1$ and
the calls to append are interleaved due to the laziness of append, we know that
$\mathtt{append}(e_1, e_2) \leadsto^{*}_{\sigma} d : \sigma(e_2)$ by definition of needed narrowing, which completes
the proof.

Note that requiring $e'_1$ to be lazily regular is not a real restriction in our
context, since terminating functions fulfill this condition by the manner in which
we introduce difference-lists in the base cases of recursive functions. On the other
hand, if we were only interested in proving an equivalence w.r.t. head normal
forms, we conjecture that this restriction could be safely dropped.

^ Here we do not consider nested occurrences of append, although the proof scheme
can be extended to cover this case by using an appropriate induction.

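Now we concentrate on stages (d) and (e). Let $\mathcal{R}_c$ be the program obtained
from stage (c) and $\mathcal{R}_e$ be the output of stage (e). In order to prove the correctness
of this step, we prove that for each function symbol f in $\mathcal{R}_c$ (defined in terms of
some f*) we have a semantically equivalent function f in $\mathcal{R}_e$ (defined in terms of
$\mathtt{f}_{acc}$). For the sake of simplicity, let us consider a recursive function of the form:

$\mathtt{f}(\overline{x_n}) = y \Leftarrow (y, []) \approx \mathtt{f}^*(\overline{x_n})$
$\mathtt{f}^*(\overline{a_n}) = (d : y,\ y)$
$\mathtt{f}^*(\overline{b_n}) = (d' : z,\ y) \Leftarrow (z,\ e : y) \approx \mathtt{f}^*(\overline{s_n})$

in $\mathcal{R}_c$, where d, d', e represent a (possibly empty) sequence of elements of the
form $d_1 : \ldots : d_k$, $k \geq 0$. According to the transformation rules in stages (d) and
(e), we produce the following definition:

$\mathtt{f}(\overline{x_n}) = \mathtt{f}_{acc}(\overline{x_n}, [])$
$\mathtt{f}_{acc}(\overline{a_n}, y) = d : y$
$\mathtt{f}_{acc}(\overline{b_n}, y) = d' : \mathtt{f}_{acc}(\overline{s_n},\ e : y)$

in $\mathcal{R}_e$. Given a (finite) list $c_1 : \ldots : c_k : []$, in order to prove that $\mathtt{f}(\overline{c_n}) \leadsto^{*} c_1 : \ldots : c_k : []$ in
$\mathcal{R}_c$ iff $\mathtt{f}(\overline{c_n}) \leadsto^{*} c_1 : \ldots : c_k : []$ in $\mathcal{R}_e$, it suffices to prove that $\mathtt{f}^*(\overline{c_n}) \leadsto^{*} (c_1 : \ldots : c_k : y,\ y)$
in $\mathcal{R}_c$ iff $\mathtt{f}_{acc}(\overline{c_n}, []) \leadsto^{*} c_1 : \ldots : c_k : []$ in $\mathcal{R}_e$. To prove this claim by induction, we
first generalize it as follows:

$(t : z,\ y) \Leftarrow (z,\ r : y) \approx \mathtt{f}^*(\overline{c_n}) \leadsto^{*}_{\sigma} (t' : l : r' : y,\ y)$ in $\mathcal{R}_c$

iff $t : \mathtt{f}_{acc}(\overline{c_n},\ r : []) \leadsto^{*}_{\sigma'} t' : l : r' : []$ in $\mathcal{R}_e$

where $\sigma = \sigma'\ [\mathcal{V}ar(\{\mathtt{f}^*(\overline{c_n}), t, r\})]$, the expressions t, r, l represent (possibly
empty) sequences of elements of the form $t_1 : \ldots : t_k$, $k \geq 0$, and t', r' are constructor
instances of t, r (in particular, $t' = \sigma(t)$, $r' = \sigma(r)$). Now we proceed by
induction on the length of the former derivation. The base case is immediate by
applying the first rules of f* and $\mathtt{f}_{acc}$ respectively. Let us consider the inductive
case. By applying the second rule of f* we have:

$(t : z,\ y) \Leftarrow (z,\ r : y) \approx \mathtt{f}^*(\overline{c_n}) \leadsto_{\theta} (t' : z,\ y) \Leftarrow (z,\ r' : y) \approx (d' : z',\ y'),\ (z',\ e : y') \approx \mathtt{f}^*(\overline{s_n})$

and, by unfolding the first equation in the condition:

$(t' : d' : z',\ y) \Leftarrow (z',\ e : r' : y) \approx \mathtt{f}^*(\overline{s_n})$

On the other hand, by applying the second rule of $\mathtt{f}_{acc}$ to $t : \mathtt{f}_{acc}(\overline{c_n},\ r : [])$, we have:

$t : \mathtt{f}_{acc}(\overline{c_n},\ r : []) \leadsto_{\theta'} t' : d' : \mathtt{f}_{acc}(\overline{s_n},\ e : r' : [])$

where $\theta = \theta'\ [\mathcal{V}ar(\{\mathtt{f}^*(\overline{c_n}), t, r\})]$. The claim follows by the inductive hypothesis.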
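$\mathtt{f}(\overline{s_n}) = []$
$\mathtt{f}(\overline{t_n}) = m_1 : \mathtt{append}(\mathtt{f}(\overline{t'_n}),\ m_2 : [])$
$\Longrightarrow$
$\mathtt{f}(\overline{x_n}) = \mathtt{f}_{acc}(\overline{x_n}, [])$
$\mathtt{f}_{acc}(\overline{s_n}, y) = y$
$\mathtt{f}_{acc}(\overline{t_n}, y) = m_1 : \mathtt{f}_{acc}(\overline{t'_n},\ m_2 : y)$

$\mathtt{f}(\overline{s_n}) = []$
$\mathtt{f}(\overline{t_n}) = m_1 : \mathtt{append}(\mathtt{f}(\overline{t'_n}),\ m_2 : \mathtt{f}(\overline{t''_n}))$
$\Longrightarrow$
$\mathtt{f}(\overline{x_n}) = \mathtt{f}_{acc}(\overline{x_n}, [])$
$\mathtt{f}_{acc}(\overline{s_n}, y) = y$
$\mathtt{f}_{acc}(\overline{t_n}, y) = m_1 : \mathtt{f}_{acc}(\overline{t'_n},\ m_2 : \mathtt{f}_{acc}(\overline{t''_n}, y))$

$\mathtt{f}(\overline{s_n}) = []$
$\mathtt{f}(\overline{t_n}) = m_1 : \mathtt{append}(\mathtt{append}(\mathtt{f}(\overline{t'_n}),\ m_2 : \mathtt{f}(\overline{t''_n})),\ m_3 : [])$
$\Longrightarrow$
$\mathtt{f}(\overline{x_n}) = \mathtt{f}_{acc}(\overline{x_n}, [])$
$\mathtt{f}_{acc}(\overline{s_n}, y) = y$
$\mathtt{f}_{acc}(\overline{t_n}, y) = m_1 : \mathtt{f}_{acc}(\overline{t'_n},\ m_2 : \mathtt{f}_{acc}(\overline{t''_n},\ m_3 : y))$

where $m_1, m_2, m_3$ are (possibly empty) sequences of the form $d_1 : d_2 : \ldots : d_k$, with $k \geq 0$.

Fig. 2. Matching scheme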

4.4 Effectiveness of the Transformation 

Throughout this section, our aim has been to define an automatic method for
achieving the effect of the difference-list transformation over functional logic
programs. We have not been concerned with the efficiency of its implementation.
It turns out that some of the stages that we have introduced appear to be
expensive to implement. Thus, as a first attempt at integrating the method
into a real compiler, we have defined a matching scheme which is both simple
and effective. We observed that, in practice, many doubly recursive functions
gain in efficiency from the transformation (as do some singly recursive functions,
provided they use append to concatenate some elements to the result of the
recursive call). These functions are matched by three simple transformation
rules (depicted in Fig. 2) and, thus, replaced by
equivalent functions without calls to append.

As an example, we consider the towers of Hanoi: 

hanoi(0, a, b, c) = [] 

hanoi(S(n), a, b, c) = append(hanoi(n, a, c, b), (a, b) : hanoi(n, c, b, a)) 

The first argument is a natural number (constructed from 0 and S), a, b and
c represent the three towers, and (a, b) a movement of a plate from a to b. By
considering that m1 is an empty sequence and m2 = (a, b), the second rule of
the scheme matches and transforms the program into the following optimized
version without concatenations:

hanoi(n, a, b, c) = han(n, a, b, c, []) 
han(0,a,b, c,y) =y 

han(S(n), a, b, c, y) = han(n, a, c, b, (a, b) : han(n, c, b, a, y)) 

Note that all the concatenations have actually disappeared. 
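
The two versions of hanoi can be compared with a small Python transcription. This is a sketch under our own encoding choices: natural numbers are Python integers, and the accumulated result is a cons-list of (head, tail) pairs with None as the empty list, so that prepending a move is a constant-time operation. The helper to_list is hypothetical.

```python
def hanoi(n, a, b, c):
    """Original version: the two recursive results are concatenated."""
    if n == 0:
        return []
    return hanoi(n - 1, a, c, b) + [(a, b)] + hanoi(n - 1, c, b, a)


def han(n, a, b, c, acc):
    """Optimized version: han(S(n),a,b,c,y) = han(n,a,c,b, (a,b) : han(n,c,b,a,y))."""
    if n == 0:
        return acc
    return han(n - 1, a, c, b, ((a, b), han(n - 1, c, b, a, acc)))


def to_list(cell):
    """Convert a cons-list made of (head, tail) pairs into a Python list."""
    out = []
    while cell is not None:
        head, cell = cell
        out.append(head)
    return out
```

For every n, to_list(han(n, a, b, c, None)) lists the same moves as hanoi(n, a, b, c), and the number of moves is 2^n - 1.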
5 Experimental Evaluation

In order to evaluate our transformation experimentally, we have incorporated the
optimization based on the matching scheme of Fig. 2 into the PAKCS compiler
for Curry [8] as an automatic source-to-source transformation which is transparent
to the user. The language Curry is an international initiative to provide a
common platform for the research, teaching and application of integrated functional
logic languages [9]. To implement the optimization, we have used the
standard intermediate representation of Curry programs: FlatCurry [8].^

To perform the experiments, we considered programs which are used in the 
literature to illustrate the benefits of difference-lists in Prolog (adapted to a 
functional logic syntax). The complete code of the benchmarks and a detailed 
description of the implementation can be found in [1]. The following table shows 
the performances of the original programs (Original) w.r.t. the improved versions 
(Optimized) by the introduction of accumulating parameters: 



Benchmarks         Original   Optimized   Speedup
rev_2000               3470          65     53.38
qsort_2000             1010         850      1.18
pre-order_2000          104          17      6.11
in-order_2000           105          16      6.56
post-order_2000         132          16      8.25
hanoi_17               4100        2160      1.89



Times are expressed in milliseconds and are the average of 10 executions. Runtime
input goals were chosen to give a reasonably long overall time. In particular,
goal subindices show the number of elements in the input lists or trees. Column
Speedup shows the relative improvements achieved by the transformation, obtained
as the ratio Original / Optimized. Results are encouraging, achieving
significant speedups for some of the examples.
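
The Speedup column can be recomputed from the two timing columns. The short Python check below is only a sketch: the benchmark labels are plain strings of our own choosing, and the observation that the reported figures correspond to truncation (rather than rounding) to two decimals is ours, derived from the numbers in the table.

```python
# (benchmark, original time in ms, optimized time in ms, reported speedup)
rows = [
    ("rev", 3470, 65, "53.38"),
    ("qsort", 1010, 850, "1.18"),
    ("pre-order", 104, 17, "6.11"),
    ("in-order", 105, 16, "6.56"),
    ("post-order", 132, 16, "8.25"),
    ("hanoi", 4100, 2160, "1.89"),
]


def speedup(original, optimized):
    """Ratio Original / Optimized truncated to two decimals, as a string."""
    hundredths = (original * 100) // optimized   # integer arithmetic avoids float noise
    return f"{hundredths // 100}.{hundredths % 100:02d}"


for name, original, optimized, reported in rows:
    print(name, speedup(original, optimized), reported)
```

For example, 1010 / 850 = 1.188..., which appears as 1.18 in the table.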

6 Related Work 

The development of list-processing optimizations has been an active research
topic both in functional and logic programming in recent decades. A related
approach to difference-lists appeared early in [11], where Hughes introduced an
optimized representation of lists, the so-called abstract lists, which are specially
defined for fast concatenation in functional programming. The idea behind
their use is similar to that of logic difference-lists, although they are formulated in
a different way. In contrast to our approach, the goal of [11] is not to provide an
automatic algorithm to replace standard lists by abstract lists, but to introduce
an efficient data structure to be used by the programmer. The idea of optimizing
concatenations was taken one step forward by Wadler in [15], where he described
local transformations for removing some concatenations from a program. The
formalization of our stepwise process to introduce accumulating parameters is,
apparently, not related to Wadler’s transformation. Nevertheless, we strongly
believe that over many examples both approaches produce a similar effect. A
formal comparison between them could be useful. For instance, we think that
our marking algorithm could be used within Wadler’s technique to identify those
functions from which concatenations will be successfully removed. On the other
hand, we could benefit from the simplicity of Wadler’s rules in some steps of our
transformation.

^ A prototype implementation, together with some examples and documentation of
the system is publicly available at: http://www.dsic.upv.es/users/elp/soft.html.
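
The representation of lists as functions discussed above in connection with [11] can be sketched as follows. This Python fragment is only an illustration of the idea, not the definition given in [11]: a list is represented by the function that prepends it to a given tail, so that concatenation becomes function composition. The names from_list, append_fn and to_list are hypothetical, and because Python lists are copied by the + operator, the sketch shows the representation rather than its cost.

```python
def from_list(xs):
    """Represent xs as the function that puts xs in front of a given tail."""
    return lambda tail: xs + tail


def append_fn(f, g):
    """Concatenation of two represented lists is function composition."""
    return lambda tail: f(g(tail))


def to_list(f):
    """Recover the ordinary list by supplying the empty tail."""
    return f([])


left = from_list([1, 2])
right = from_list([3])
both = append_fn(left, right)
```

Here to_list(both) gives [1, 2, 3]; supplying the empty tail plays the same role as instantiating the second component of a difference-list to [].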

The above techniques optimize functions that independently build different
sections of a list to be combined later. Apart from this kind of optimization,
there are a number of program transformations for optimizing list-processing
functions which use some intermediate list to compute the result. The
most popular such transformations are Wadler’s deforestation [16] and the short
cut to deforestation of [6]. Also, [1] adapts the short cut deforestation technique
to a functional logic setting. Although the techniques are different, their aim is
essentially the same: to transform a function into another one which does not
create intermediate lists. Such optimization techniques are complementary to
the difference-list transformation in the sense that, generally, if a program can
be improved by one of them, then the other is not effective, and vice versa (see
[1] for details).

7 Conclusions and Future Work 

In this paper, we have presented a novel transformation for improving list-processing
functions in the context of a multi-paradigm functional logic language:
an automatic transformation based on the introduction of accumulating
parameters. Furthermore, it has been shown to be practical and effective by testing it
within a real functional logic compiler, the PAKCS compiler for Curry [8].

The concept underlying difference-lists is the use of the difference operation
between (incomplete) data structures to represent (partial) results of a computation.
This could also be applied to other recursive data types apart from lists (see,
e.g., [12,13]). A promising direction for future work is the generalization of our
stepwise transformation to arbitrary (algebraic) data types. Another interesting
topic is the definition of abstract measures to quantify the performance of functional
logic programs, i.e., measures independent of concrete implementations.
We expect that these measures will also help to find new optimizations
and to determine their power.
Abstract: This paper presents a new logical topology, GIADM-net, a generalised
IADM network for enhancing the reliability of optical networks using wavelength
division multiplexing. The presence of multiple paths of the same distance
between any two nodes, in exchange for a reasonable number of hops in the network,
ensures a higher degree of reliability than other existing topologies in case
of link failure, and helps to balance link loading in the network so as to
maximise the network throughput. GIADM-net connects any arbitrary number of
nodes in a regular graph, as opposed to the De Bruijn graph and
shufflenet. The average hopping distance between two nodes using this topology
is smaller than that in GEM-net, shufflenet and the De Bruijn graph, at the
cost of a marginal increase in diameter.



1. Introduction

Optical networks [4] use interconnections of high-speed wide-band fibers for
transmitting information between any source-destination pair of nodes. The huge
bandwidth of the optical fibers has been exploited by the wavelength division
multiplexing (WDM) [6] approach, where several communication channels operate at
different carrier wavelengths on a single fibre. End-users in a fibre-based WDM
backbone network may communicate with one another via all-optical channels, which
are referred to as lightpaths [1]. A lightpath between end-nodes is a path between them
through router(s) or star coupler(s) using a particular wavelength (often called a
channel) for each segment of the fibre traversed. In an N-node network, if each node
is equipped with N-1 transceivers and if there are enough wavelengths on all fiber
links, then every node-pair could be connected by an all-optical lightpath. However, the cost
of the transceivers dictates that each node be equipped with only a few of them, which
limits the number of WDM channels in a fiber to a small value d. Thus, only a
limited number of lightpaths may be set up on the network. Typically, the physical
topology of a network consists of nodes and fibre-links in a broadcast star or ring or
bus. On any underlying physical topology, one can impose a carefully selected
connectivity pattern that provides dedicated connections between certain pairs of
nodes. Traffic destined to a node that is not directly receiving from the transmitting
node must be routed through the intermediate nodes. The overlaid topology is
referred to as a multihop logical topology. GEM-net [2], the De Bruijn graph [1],
Shufflenet [2] etc. are examples of such existing multihop [10] logical topologies. The
lightpath definition between the nodes in an optical network, or the logical topology, is
usually represented by a directed graph (or digraph) G = (V,E) (where V is the set of
nodes and E is the set of edges) with each node of G representing a node of the
network and each edge (denoted by u->v) representing a lightpath from node u to node
v.

This paper considers the problem of enhancing the reliability of an optical network
[4], [6], [7], [3], by overlaying a new logical topology over a wavelength-routed all-optical
network physical topology, and also compares the features of the existing logical
topologies with this new topology, GIADM-net.

A GIADM-net is a generalised IADM network [5], used as a multihop logical
topology. The characteristic feature of having multiple paths of the same
distance between any two nodes, in exchange for a reasonable number of hops in the
network, is the main criterion for selecting such a logical topology. The major problems in
the optical network, such as link failure or imbalance in link loading, can
be handled properly by using this multipath property of the network, ensuring a higher
degree of reliability compared to other existing logical topologies. Any arbitrary number
of nodes can be connected in a regular fashion, as in GEM-net but as opposed to the
cases of the De Bruijn graph and Shufflenet. The property of the IADM network has been
extended here to connect any number of nodes in the optical network. The average
hopping distance between two nodes in the GIADM logical topology is smaller
than that of GEM-net, shufflenet and the De Bruijn graph, at the expense of a marginal increase
in the diameter of the regular graph.

After describing the GIADM-net architecture, we study its construction, diameter,
average hopping distance, number of paths between any source and destination, and
routing [8],[9]. A comparative table has been compiled to discuss the different
properties of these topologies, e.g., De Bruijn graph, GEM-net and GIADM-net.

2. GIADM-Net Architecture and the Interconnection Pattern 

GIADM-net is a regular multihop network architecture, which is a generalization of
the IADM network. The network is generalized in the sense that any number of
nodes may be connected with this topology. Fig. 1 shows a logical GIADM-net topology
connecting 6 nodes overlaid on the physical star topology network. The N nodes of a
GIADM-net are arranged in k columns, where k satisfies the
relationship: $(k-1)(2^{k-1} - 1) < N \leq k(2^k - 1)$.

Each of (k - N mod k) columns has M = N div k nodes,
whereas each of the remaining (N mod k) columns has (N div k + 1) nodes.
Nodes in adjacent columns are connected according to a generalization of the IADM
connectivity pattern using directed links. The generalization allows any number of nodes
in a column, as opposed to the constraint in the IADM network. The number of links from
each node is 3. Node (c, r), where c is the column number and r is the node number, is
connected to node (c', (r ± 2^i) mod M) and node (c', r), where c' = (c+1) mod k and i = 0,
1, 2, ..., k-1.
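
The column arrangement can be illustrated with a few lines of Python. This is a sketch under our reading of the relationship above: k is taken as the smallest integer with N ≤ k(2^k - 1), which automatically satisfies the lower bound. The function names num_columns and column_sizes are hypothetical, and the order in which the smaller and larger columns are listed is an arbitrary choice of the sketch.

```python
def num_columns(n):
    """Smallest k such that n <= k * (2**k - 1)."""
    k = 1
    while n > k * (2 ** k - 1):
        k += 1
    return k


def column_sizes(n):
    """Sizes of the k columns: (k - n mod k) columns of n div k nodes,
    and the remaining (n mod k) columns of n div k + 1 nodes."""
    k = num_columns(n)
    m, extra = divmod(n, k)
    return [m] * (k - extra) + [m + 1] * extra


for n in (6, 8, 24, 64, 256, 1024):
    print(n, num_columns(n), column_sizes(n))
```

For the 6-node network of Fig. 1 this gives k = 2 with two columns of 3 nodes, and for N = 8, 24, 64, 256 and 1024 it gives k = 3, 4, 5, 6 and 8 respectively.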
2.1 Diameter

The diameter of a GIADM-net is obtained as follows:

Let $CN_{i,j}$ be the total number of nodes in stage j connected to a single node at stage i
by the interconnection. $CN_{i,(j+1) \bmod k}$ is related to $CN_{i,j}$ by the following recurrence
relation:

$CN_{i,(j+1) \bmod k} = 2\,CN_{i,j} + 1$, and
$CN_{i,i} = 1$ (trivial case)

$$
\begin{aligned}
CN_{i,i+k-1} &= 2\,CN_{i,i+k-2} + 1 \\
&= 2\,(2\,CN_{i,i+k-3} + 1) + 1 \\
&\;\;\vdots \\
&= 2\,(2\,(2\,(\cdots(2\,CN_{i,i} + 1) + 1) + 1) \cdots + 1) + 1 \\
&= 2^{k-1} + 1 + 2 + 2^2 + \cdots + 2^{k-2} \\
&= 2^k - 1
\end{aligned}
$$

So, starting at any node, each of the $2^k - 1$ nodes in a particular
column can be reached for the first time on the (k-1)th hop. All the nodes which are not
covered in the previously visited columns will finally be covered in an additional (k-1)
hops. Thus, the maximum number of hops taken by a packet of information to
travel from a source to a destination along a shortest path in the network is D = (k - 1) +
(k - 1) = 2k - 2.
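
The recurrence and the resulting diameter are easy to check numerically. The following Python sketch simply iterates CN ← 2·CN + 1 starting from 1; the function names reachable_per_stage and diameter are hypothetical.

```python
def reachable_per_stage(k):
    """Values CN_{i,i}, CN_{i,i+1}, ..., CN_{i,i+k-1} given by CN <- 2*CN + 1."""
    cn = [1]
    for _ in range(k - 1):
        cn.append(2 * cn[-1] + 1)
    return cn


def diameter(k):
    """D = (k - 1) + (k - 1) = 2k - 2."""
    return 2 * k - 2


for k in (3, 4, 5, 6, 8):
    print(k, reachable_per_stage(k)[-1], diameter(k))
```

After k - 1 steps the recurrence reaches 2^k - 1, in agreement with the closed form derived above; for k = 3, 4, 5, 6 and 8 the diameter 2k - 2 evaluates to 4, 6, 8, 10 and 14.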

2.2 Average Hopping Distance in a GIADM-Net Can Be Obtained as Follows : 

i (i+k-l)modk 

(Z ( S J CN,j + ( 2' -l-CN,j)0 + k ) )modk))2/N(N-\) 

k-l 

[2] shows the superiority of GEM-net over the De Bruijn graph regarding average hopping
distance.

Some comparative results containing diameter and average hopping distance for
different values of N of a GIADM-net and GEM-net are shown below:



Number of   Number of columns k   Average hopping distance   Diameter
nodes       GEM-net   GIADM       GEM-net        GIADM       GEM-net   GIADM
8           1-4       3           2.000-2.286                221       4
24          1-4       4           3.261-3.478    3.086       5-6       6
64          1-8       5           4.448-5.714    4.206       6-10      8
256         1-8       6           6.336-7.561    6.223       8-12      10
1024        1-8       8           8.297-9.517    8.22        10-14     14



2.3 Multiple Number of Node-Disjoint Paths between a Source and a Destination 

The majority of the nodes are connected to a source node at the shortest distance through
multiple node-disjoint paths. The remaining nodes, connected to the source
node by a single shortest path of h hops, are connected to it by
multiple node-disjoint paths of h+k hops. It can be shown
easily that any node can be connected to any other node in the network through multiple
node-disjoint paths requiring at most D+1 = 2k-1 hops.

2.3.1 Special Case: Computation of the Number of Paths in a GIADM Connecting
k * ($2^k - 1$) Nodes Arranged in k Columns:

Let the source node (i,p) be the node in the pth position of the ith stage. Let
$np_{(i,p),(j,q)}$ be the number of paths connecting the source node (i,p) to the
destination node (j,q).

Following relationship holds good till j+1 mod k = i+k-1 

i) np(i,p)-(j+i,r) = np(,,p)-(j,q) 

when q = (p± 2'.m) mod (N div k) 

and r = (p ± 2' * CNij div2 + m + 1) mod (N div k) 

for 0 < m < CNij div 2 

ii) np(i_p)-(j+i,p) — np(i,p)-(jq) 

iii) np(i_p).(i+i j) — np(i_p).(j,r) + nP(i,p)-(j,t) 

where r = (p ± m.2’) mod (N div k) 

and t = (p _ (CNij div 2). 2‘ + (m-l).2‘) mod (N div k) 

for 1 < m < CNij div 2 

and also, 

np(i,p)-(i+i,q) = 1 whereas q = p + 2'.m and m = 0,1 

After computing the number of paths from the source node (i,p) to all the nodes
that are accessible within (k-1) hops, the number of paths to all other nodes, which are
accessible in more than (k-1) hops, can be calculated using the following recurrence
relation:

$np_{((j+1) \bmod k,\; q)} = np_{(j,\; q)} + np_{(j,\; (q + 2^j) \bmod (N \,\mathrm{div}\, k))} + np_{(j,\; (q - 2^j) \bmod (N \,\mathrm{div}\, k))}$.

The above relation is used till (j+1) mod k = 2k-2 

2.4 Routing 

A very simple routing scheme has been adapted here from the scheme for IADM.

Let S (i,p) and D (j,q) be the source node and the destination node respectively, where
$0 \le i, j \le k-1$ and $0 \le p, q \le 2^k - 2$.



2.4.1 Case 1 : j > i 

a) Destination nodes which are reachable from the source node in less than or equal to 
(k-1) hops : 

To specify an arbitrary path in the GIADM-net, at least 2(j-i) bits are required. The
(j-i) low-order bits represent the magnitudes of the route and the (j-i) high-order bits
represent the signs corresponding to those magnitudes.

Given a source input S and a routing tag F of length 2(j-i), the destination D can be
calculated as:






D = 






mod (N div k) 



b) Destination nodes which are reachable in > (k-1) hops : 

Routing Tag F is of length 2 * (j-i+k). The destination D can be calculated as 
follows 
D = 



k^I 

s=i 



i-1 

s=0 



* 



/ 



k-i+s 




mod (N div K) 



2.4.2 Case 2 : j < i 

a) Destination nodes which are reachable from the source node in less than or equal to 
(k-1) hops : 

Routing tag F is of length: 2 * (k-i+j) 



k-I 

D = + ' * 

s=i 






i-i 

* 2 ‘' + ^ (- 1 f 

s=0 



* f 

J 2k-2Hj-^s 



* 2'7 niod (N div k) 



b) Destination nodes which are reachable in greater than (k-1) hops : 

Routing tag F is of length : 2 * (2k - i + j) 

D = [s + Y,(-^ * fs-i * * fk-iks * 

s=i 5=0 



j-i 

+ ^(-1 * 2" 7 mod (NdvK) 

s=0 



3. Comparative Table Showing the Salient Features of Different 
Topologies with That of GIADM-Net 



Different Parameters           | De Bruijn Graph (Δ, D)   | *GEM-net                    | GIADM-net
-------------------------------+--------------------------+-----------------------------+---------------------------
Nodal degree (k)               | 2, 3, 4                  | 2, 3, 4                     | 3
Number of links                | 2N, 3N, 4N               | 2N, 3N, 4N                  | 3N
Number of shortest paths       | 1                        | 1 for the best              | > 1
between a source-destination   |                          | configuration               |
pair                           |                          |                             |
Maximum number of              | (k-1) for nodal degree k | Not discussed               | 3 (more than one
node-disjoint paths between a  |                          |                             | combination may exist)
source-destination pair        |                          |                             |
Scalability                    | No                       | Yes                         | Yes
Average hopping distance       | d1 (say)                 | d2 (say); d2 < d1 for the   | d3 (say); d3 < d2
                               |                          | best configuration          |


* Data collected from [2].

The table above shows the following facts:

i) The number of shortest paths between a source-destination pair in GIADM-net is higher
than that in the de Bruijn graph, and also than that in GEM-net for its best configuration.

ii) The maximum number of node-disjoint paths between a source-destination pair in the
de Bruijn graph is almost equal to its degree, whereas in GIADM-net more than one
combination of 3 node-disjoint paths exists at a time.

iii) The average hopping distance in GIADM-net is the smallest of the three topologies,
as obtained from the statistics.

iv) GIADM-net may be scalable, like GEM-net.

v) The nodal degree of each node in GIADM-net is 3 whatever the size of the network,
unlike in the other existing topologies. The total number of links in GIADM-net is 3N,
which is less than in the other topologies.

4. Conclusion 

The proposed new topology, GIADM-net, a generalised IADM network, introduces a higher
degree of reliability in multihop WDM optical networks than existing topologies such as
the de Bruijn graph and shufflenet, by providing multiple node-disjoint paths of the same
distance between any two nodes in exchange for a reasonable number of hops in the
network. GIADM-net was designed to be scalable like GEM-net, as opposed to the
de Bruijn graph and shufflenet. The paper also summarises how to compute the number of
paths between any source-destination pair of a GIADM-net in a special case, together
with the corresponding routing scheme. The comparative table at the end of the paper
shows that the average hopping distance in GIADM-net is the smallest among the existing
topologies, and that more than one combination of three node-disjoint paths may exist at
best for a source-destination pair in GIADM-net, whereas only one combination of
node-disjoint paths exists in the de Bruijn graph, at the cost of higher network
complexity compared with GIADM-net.
Abstract: A new scalable logical topology for multihop optical networks, based on
the de Bruijn graph, is presented in this paper. The de Bruijn graph, which has a
simple routing property, is a regular non-scalable logical topology whose diameter
is logarithmic in the number of nodes. The proposed topology adds scalability to
the advantages of the de Bruijn graph, keeping perturbation in the network at a
very low level during insertion of nodes, at the cost of a marginal variance in the
degree of the network.



1. Introduction 

High-speed wide-band fibres [10] in optical networks [8],[9] are used for transmitting
information between any source-destination pair of nodes. In this regard,
wavelength-division multiplexing (WDM) [1] is the approach where several communication
channels operate at different carrier wavelengths on a single fibre, exploiting the huge
bandwidth of optical fibres. End-users in a fibre-based WDM backbone network
may communicate with one another via all-optical [6], [7] channels, which are
referred to as lightpaths. A lightpath between end-nodes is a path between them
through router(s) or star coupler(s) using a particular wavelength (often called a
channel) for each segment of the fibre traversed. In an N-node network, if each node
is equipped with N-1 transceivers and if there are enough wavelengths on all fibre
links, then every node pair could be connected by an all-optical lightpath. But the cost
of the transceivers dictates that each node be equipped with only a few of them, which
limits the number of WDM channels in a fibre to a small value d. Thus, only a limited
number of lightpaths may be set up on the network.

Typically, the physical topology of a network consists of nodes and fibre links in a
broadcast star, ring or bus. On any underlying physical topology, one can impose a
carefully selected connectivity pattern that provides dedicated connections between
certain pairs of nodes. Traffic destined for a node that does not receive directly from
the transmitting node must be routed through intermediate nodes. The overlaid
topology is referred to as a multi-hop logical topology. GEM-net [2], the de Bruijn graph
[3], shuffle-net [2], etc. are examples of such existing multi-hop logical topologies.
The lightpath definition between the nodes in an optical network, or the logical
topology, is usually represented by a directed graph (or digraph) G = (V,E) (where V is
the set of nodes and E is the set of edges), with each node of G representing a node of
the network and each edge (denoted by u->v) representing a lightpath from node u to
node v. The desirable criteria for this graph are (i) small nodal degree for low cost,
(ii) a simple routing scheme that avoids the need for complex routing tables, (iii) small
diameter for faster communication, and (iv) growth capability with the least possible
perturbation in the network. The de Bruijn graph, being a regular topology with
a structured node connectivity, has simple routing [4] schemes and can support a
large number of nodes with a small diameter and small nodal degree. However,
scalability [5] remains a problem with such regular structures, where the number of nodes
in the network is defined by some mathematical formula. Irregular multi-hop
structures generally address the optimality criteria directly, but the routing becomes
complex due to the lack of a structural connectivity pattern.

Our topology, based on the de Bruijn graph, seeks to address this problem. For any
integers d and k, the topology becomes a de Bruijn graph when the number of nodes in
the network equals $d^k$ or $(2d)^k$, where d and 2d are the degrees of the respective
graphs and k is the diameter. When $d^k$ < number of nodes < $(2d)^k$, the irregular
graph structure keeps its diameter at the value k and still maintains a simple routing
scheme. Moreover, as nodes are inserted one after another, the perturbation as well as
the variance of degree in the network are kept at a low level.




2. Proposed Interconnection Topology 

The proposed interconnection topology takes the structure of a de Bruijn graph when
the number of nodes (N) in the network equals $d^k$. This directed graph has the set of
nodes $\{0, 1, 2, \ldots, d-1\}^k$, with an edge from node $a_1 a_2 \ldots a_k$ to node
$b_1 b_2 \ldots b_k$ if and only if the condition $b_i = a_{i+1}$ is satisfied, where
$a_i, b_i$ belong to the set $A = \{0, 1, 2, \ldots, d-1\}$ and $1 \le i \le k-1$. Each
node has indegree and outdegree d, and the diameter of the graph is k. Fig. 1 shows a
$2^3$ de Bruijn graph with degree 2 and diameter 3.
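The construction above can be sketched in a few lines. This is an illustrative sketch only: the function names `de_bruijn_graph` and `diameter` are hypothetical, nodes are represented as digit tuples, and the diameter routine assumes the graph is strongly connected.

```python
from collections import deque
from itertools import product


def de_bruijn_graph(d: int, k: int) -> dict:
    """Map each k-digit node over {0..d-1} to its d children (shift left, append a digit)."""
    return {u: [u[1:] + (b,) for b in range(d)]
            for u in product(range(d), repeat=k)}


def diameter(graph: dict) -> int:
    """Longest shortest-path length, found by a breadth-first search from every node."""
    worst = 0
    for src in graph:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst


g = de_bruijn_graph(2, 3)        # the d = 2, k = 3 graph of Fig. 1
assert len(g) == 2 ** 3          # d^k nodes
assert diameter(g) == 3          # diameter equals k
```

Self-loops (for example at 000 and 111) are counted as edges here, which is why every node has exactly d outgoing links.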

The proposed topology also assumes the structure of a de Bruijn graph when the
number of nodes (N) in the network equals $(2d)^k$, where 2d is the indegree or
outdegree of each node and k is the diameter, with the set of nodes
$\{0, 1, 2, \ldots, 2d-1\}^k$ and an edge from node $a_1 a_2 \ldots a_k$ to node
$b_1 b_2 \ldots b_k$ iff the condition $b_i = a_{i+1}$ is satisfied, where $a_i, b_i$
belong to $Z = \{0, 1, 2, \ldots, 2d-1\}$ and $1 \le i \le k-1$. We now consider the
set $X = Z - A = \{d, d+1, \ldots, 2d-1\}$.
Figure 1: A de Bruijn graph with $2^3$ nodes, d = 2 and k = 3 

The proposed interconnection topology, when $d^k < N < (2d)^k$, assumes an
insertion strategy in which the nodes to be connected in the network are inserted in
k phases, as shown in Table 1 (where the subscript denotes the value and the
superscript denotes the position in its representation):

Table 1 

Each phase i is divided into $2^{i-1}$ subphases. Node patterns in each phase
follow the sequence given below:
Table 2 

Phase 1:
  position:  0    1    2    ...  k-2      k-1
  pattern:   a_0  a_1  a_2  ...  a_{k-2}  x_{k-1}

Phase 2:
  position:  0    1    2    ...  k-3      k-2      k-1
  patterns:  a_0  a_1  a_2  ...  a_{k-3}  x_{k-2}  a_{k-1}
             a_0  a_1  a_2  ...  a_{k-3}  x_{k-2}  x_{k-1}

Phase 3:
  position:  0    1    2    ...  k-4      k-3      k-2      k-1
  patterns:  a_0  a_1  a_2  ...  a_{k-4}  x_{k-3}  a_{k-2}  a_{k-1}
             a_0  a_1  a_2  ...  a_{k-4}  x_{k-3}  a_{k-2}  x_{k-1}
             a_0  a_1  a_2  ...  a_{k-4}  x_{k-3}  x_{k-2}  a_{k-1}
             a_0  a_1  a_2  ...  a_{k-4}  x_{k-3}  x_{k-2}  x_{k-1}



The pattern is repeated in subsequent phases. In general, the nodes within every
phase and sub-phase are inserted strictly in ascending order of their values.

Example: Let d=2, k=2; A={0,1}; X={2,3}; Z={0,1,2,3} 

Nodes already existing in the $d^k$ de Bruijn graph: 00, 01, 10, 11. The nodes to be
inserted follow the insertion sequence:

Phase 1: 02, 03, 12, 13 

Phase 2: (a) 20, 21, 30, 31 
(b) 22, 23, 32, 33 

Insertion of all these nodes in the network leads to a $(2d)^k$ de Bruijn graph.
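The phase and sub-phase ordering can be generated mechanically. The sketch below is an interpretation of Table 2 rather than the authors' procedure: the name `insertion_sequence` is hypothetical, nodes are written as digit strings (so it assumes 2d <= 10), and sub-phases are enumerated by treating each trailing position as drawn from A or from X, in that order.

```python
from itertools import product


def insertion_sequence(d: int, k: int) -> list:
    """Return phases[i-1][s-1] = nodes of sub-phase s of phase i, in insertion order."""
    A = range(d)           # digits of the original d^k graph
    X = range(d, 2 * d)    # the new digits
    phases = []
    for i in range(1, k + 1):
        subphases = []
        # Positions 0..k-i-1 come from A, position k-i from X, and each of the
        # i-1 trailing positions from A or X (2^(i-1) sub-phases in total).
        for tail in product((A, X), repeat=i - 1):
            digit_sets = [A] * (k - i) + [X] + list(tail)
            subphases.append(["".join(map(str, node))
                              for node in product(*digit_sets)])
        phases.append(subphases)
    return phases


for number, phase in enumerate(insertion_sequence(2, 2), start=1):
    print("Phase", number, phase)
```

For d = 2 and k = 2 this reproduces the sequence of the example, and the total number of generated nodes is $(2d)^k - d^k$.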

2.1. Lemma 1: At least d real parents of a node (to be inserted next) always exist.

Proof: The real parents of a node to be inserted in the ith phase ($1 \le i \le k$),
having the pattern
$a_0^{0} a_1^{1} a_2^{2} \ldots a_{k-i-1}^{k-i-1}\, x_{k-i}^{k-i}\, z_{k-i+1}^{k-i+1} \ldots z_{k-1}^{k-1}$,
can be evaluated following the convention of the de Bruijn graph as
$*\, a_0^{1} a_1^{2} \ldots a_{k-i-1}^{k-i}\, x_{k-i}^{k-i+1}\, z_{k-i+1}^{k-i+2} \ldots z_{k-2}^{k-1}$,
where * in the 0th position of the pattern can take any value from Z.

The above representation of the parents shows that they were inserted in the (i-1)th
phase when * belongs to A. For i=1, the real parents are available from the $d^k$
de Bruijn graph. So, at least d parents are always available for a node during
insertion.

2.2. Lemma 2: The real children of a node, following the de Bruijn graph convention,
may not exist during insertion.

Proof: The real children of a node to be inserted in the ith phase, having the
representation
$a_0^{0} a_1^{1} a_2^{2} \ldots a_{k-i-1}^{k-i-1}\, x_{k-i}^{k-i}\, z_{k-i+1}^{k-i+1} \ldots z_{k-1}^{k-1}$,
may be obtained following the de Bruijn graph convention by shifting the representation
left by one digit and inserting any digit at the right. Table 1 shows that all such
nodes belong to the (i+1)th phase, whose insertion has not yet been carried out.
2.3. Definitions: 

Real Parent: The node to be inserted in the ith phase is of the form
$a_0^{0} a_1^{1} a_2^{2} \ldots a_{k-i-1}^{k-i-1}\, x_{k-i}^{k-i}\, z_{k-i+1}^{k-i+1} \ldots z_{k-1}^{k-1}$.
The real parents of the node form the set RP{x}, where x is obtained by shifting the
node bit pattern by one bit to the right and inserting a z ($\in$ Z) bit in the leftmost
position left blank by the shift. Each x has the bit pattern
$z\, a_0^{1} a_1^{2} \ldots a_{k-i-1}^{k-i}\, x_{k-i}^{k-i+1}\, z_{k-i+1}^{k-i+2} \ldots z_{k-2}^{k-1}$.

Real Children: The node to be inserted in the ith phase is of the form
$a_0^{0} a_1^{1} a_2^{2} \ldots a_{k-i-1}^{k-i-1}\, x_{k-i}^{k-i}\, z_{k-i+1}^{k-i+1} \ldots z_{k-1}^{k-1}$.
The real children of the node form the set RC{x}, where x is obtained by shifting the
node bit pattern by one bit to the left and inserting a z ($\in$ Z) bit in the rightmost
position, which has been left blank by the shift. Each x has the bit pattern
$a_1^{0} a_2^{1} \ldots a_{k-i-1}^{k-i-2}\, x_{k-i}^{k-i-1}\, z_{k-i+1}^{k-i} \ldots z_{k-1}^{k-2}\, z$,
where $z \in Z$.

Temporary children: The node to be inserted in the ith phase is of the form
$a_0^{0} a_1^{1} a_2^{2} \ldots a_{k-i-1}^{k-i-1}\, x_{k-i}^{k-i}\, z_{k-i+1}^{k-i+1} \ldots z_{k-1}^{k-1}$.
The temporary children of the node form the set TC{x}, where x is obtained by a
two-step process:

(i) d is subtracted from the leftmost x-digit of the node bit pattern.

(ii) The resulting bit pattern is shifted to the left by one bit, and a z ($\in$ Z)
bit is inserted in the rightmost position left blank by the shift operation.

Each x thus has the bit pattern
$a_1^{0} a_2^{1} \ldots a_{k-i-1}^{k-i-2}\, (x_{k-i} - d)^{k-i-1}\, z_{k-i+1}^{k-i} \ldots z_{k-1}^{k-2}\, z$,
where z in the (k-1)th position of the pattern may take any value from Z.
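The three definitions translate directly into string operations. The following is a minimal sketch, assuming nodes are written as digit strings with 2d <= 10; the function names are hypothetical and not taken from the paper.

```python
def real_parents(node: str, d: int) -> list:
    """Shift right by one digit and put any digit of Z in the leftmost position."""
    return [str(z) + node[:-1] for z in range(2 * d)]


def real_children(node: str, d: int) -> list:
    """Shift left by one digit and put any digit of Z in the rightmost position."""
    return [node[1:] + str(z) for z in range(2 * d)]


def temporary_children(node: str, d: int) -> list:
    """Subtract d from the leftmost x-digit (a digit >= d), then take the real children."""
    digits = [int(c) for c in node]
    for pos, value in enumerate(digits):
        if value >= d:
            digits[pos] = value - d
            return real_children("".join(map(str, digits)), d)
    raise ValueError("node has no x-digit, so it has no temporary children")


print(real_parents("002", 2))
print(real_children("002", 2))
print(temporary_children("002", 2))
```

Whether a parent or child is actually linked depends on which of these candidate nodes are already present in the network at insertion time; the functions only enumerate the candidates.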

2.4 Lemma 3: At least d and at most 2d temporary children of a node to be inserted
always exist.

Proof: As stated in the definition of temporary children in 2.3, $(x_{k-i} - d)$
belongs to A. So the representation of the temporary children of the node to be
inserted in the ith phase takes the form
$a_1^{0} a_2^{1} \ldots a_{k-i}^{k-i-1}\, z_{k-i+1}^{k-i} \ldots z_{k-1}^{k-2}\, *$,
where * in the (k-1)th position of the pattern belongs to Z. Table 1 and Table 2 show
that at least d and at most 2d such nodes are already present in the network. Hence,
the lemma is proved.

2.5 Insertion Strategy 

1. Nodes are selected in the sequence described by Table 1 and Table 2.

2. Each node is connected with at least d real parents (ref. 2.1).

3. Each node is also connected with its temporary children, whose number may vary
from d to 2d (ref. 2.3 and 2.4). If the number of available temporary children is
less than 2d, the unavailable children are marked.

4. It is checked whether the node to be inserted is an unavailable temporary
child of some other node already present in the network. If so, they are
connected by a temporary link.

5. As is evident from Table 1, nodes inserted in phase i ($1 \le i \le k-1$) and
connected to their temporary children get their real children as soon as the nodes in
phase i+1 are inserted in the network. So, insertion of a node in phase i+1 is
followed by the additional step (other than steps 2, 3 and 4) of releasing the links
between nodes inserted in phase i and their respective temporary children while
connecting them to their real children. More precisely, insertion of a node
$a_0^{0} a_1^{1} \ldots a_{k-i-2}^{k-i-2}\, x_{k-i-1}^{k-i-1}\, z_{k-i}^{k-i} \ldots z_{k-1}^{k-1}$
in stage i+1 ($1 \le i \le k-1$) establishes links with its real parents
$*\, a_0^{1} a_1^{2} \ldots a_{k-i-2}^{k-i-1}\, x_{k-i-1}^{k-i}\, z_{k-i}^{k-i+1} \ldots z_{k-2}^{k-1}$
(* in the 0th position belongs to A), while each of these real parents loses its link
to the respective temporary child
$a_0^{0} a_1^{1} \ldots a_{k-i-2}^{k-i-2}\, (x_{k-i-1} - d)^{k-i-1}\, z_{k-i}^{k-i} \ldots z_{k-1}^{k-1}$,
to which it was connected in the ith phase of insertion.

Case study: 

Insertion: 

Let us insert the nodes N1 = 002 and N2 = 003 into the de Bruijn network of
Figure 1.

Parents of N1 = *00 = (000, 100, 200, 300); two nodes from the parent set are present
in the original network. Children of N1 = 02* = (020, 021, 022, 023), none of which is
present. Hence the temporary children of N1 are found as follows.

TC(N1) = C(000) = 00* = (000, 001, 002, 003); two of the temporary children are
present.

Similarly, P(N2) = P(003) = *00 = (000, 100, 200, 300), and C(N2) = 03* =
(030, 031, 032, 033), none of which is present. Hence TC(N2) = C(001) = 01* =
(010, 011, 012, 013), and the network becomes:




Figure 2 : The network after the insertion of nodes 002 & 003 
2.6. Some Important Properties of the Topology 

Some important properties of the topology are listed below. They follow easily from
Table 1, Table 2, the insertion strategy, and Lemmas 1, 2 and 3.

Property 1: At the end of insertion of all nodes in phase 1, all nodes in the $d^k$
de Bruijn graph have all of their real children as present in the $(2d)^k$ de Bruijn
graph, and the nodes inserted in that phase have all 2d of their temporary children,
whose rightmost digits constitute the set Z.

Property 2: Insertion of all the nodes in a phase i, $1 \le i < k$, results in 2d
temporary children for each of the nodes inserted in that phase.

Property 3: Insertion of all nodes in the ith phase ($1 < i \le k$) replaces all 2d
temporary children with the same number of real children for the nodes inserted in
the (i-1)th phase.

Property 4: During insertion of nodes in phase i, each of the nodes in phase i-1,
$1 < i \le k$, remains connected with a total of 2d real and temporary children whose
rightmost digits constitute the set Z.

Property 5: 2d temporary children are available for each node of sub-phase s,
$1 \le s < 2^{i-1}$, in phase i, $1 < i < k-1$, at the time of its insertion. During
insertion, at least d temporary children are available for the nodes in sub-phase
$2^{i-1}$ of phase i, $1 < i < k$.

Property 6: 2d real children are available for each node of sub-phases s,
$1 \le s < 2^{k-1}$, in phase k at the time of its insertion. During insertion, at
least d real children are available for the nodes in sub-phase $2^{k-1}$ in phase k.
At the end of insertion of all nodes in phase k, all the nodes in phase k are
connected with their 2d real children.

Property 7: During insertion of nodes in phase i, $1 < i < k$, each of the nodes
inserted in a phase earlier than i-1 has 2d real children, each of the nodes inserted
in phase i-1 is connected to a total of 2d temporary and real children whose
rightmost digits constitute the set Z, and the nodes inserted in all sub-phases s,
$1 \le s < 2^{i-1}$, of phase i have 2d temporary children whose rightmost
digits also constitute the set Z.

Property 8: During insertion of nodes in phase k, each of the nodes in a phase earlier
than k-1 remains connected to its 2d real children, whose rightmost digits
constitute the set Z, each of the nodes in phase k-1 is connected to a total of 2d
temporary and real children whose rightmost digits constitute the set Z,
and the nodes inserted in sub-phases s, $1 \le s < 2^{k-1}$, of phase k are connected
to 2d real children.

Property 9: At the end of insertion of all nodes in phase k, the nodes inserted in the
(k-1)th and kth phases are connected to 2d real children, resulting in the $(2d)^k$
de Bruijn graph, where all the nodes have their 2d real children.

2.7 Lemma 4: The number of links perturbed during the insertion of a node
is at most 2d.

Proof: At each insertion of a node, at most 2d temporary links are
released from its real parents to their respective temporary children. Hence, the
lemma is proved.
2.8 Lemma 5: The diameter of the proposed interconnection remains k.

Proof: 

Case 1: Insertion of nodes in phase 1 

During insertion of nodes in phase 1, the node pattern for both the source and the
destination is $a_0^{0} a_1^{1} a_2^{2} \ldots a_{k-2}^{k-2}\, z_{k-1}^{k-1}$. For a
source where $z_{k-1}^{k-1} = a_{k-1}^{k-1}$, the destination can be reached easily
through at most k hops via a real child at each hop. For a source where
$z_{k-1}^{k-1} = x_{k-1}^{k-1}$, the first hop leads to a temporary child because of
the absence of any real child. At most another k-1 hops, each via a real child because
of the node pattern, then lead to the destination.

Case 2: Insertion of nodes in phase i ($1 < i < k$)

(a) Insertion of nodes in any sub-phase s ($1 \le s < 2^{i-1}$):

In this situation, there must exist a route connecting any source to any
destination by at most k hops, where each hop leads to either a real or a temporary
child whose rightmost digit matches the leftmost digit (not yet scanned) of the
destination.

(b) Insertion of nodes in sub-phase $2^{i-1}$ of any phase i, $1 < i < k$:

We already observed that the nodes inserted in sub-phases s, $1 \le s < 2^{i-1}$, of
phase i, $1 < i < k$, and in phases 1, 2, ..., i-1 are already connected with 2d real
children, temporary children, or a combination of real and temporary children whose
last digits constitute the set Z. The nodes inserted in sub-phase $2^{i-1}$ of this
phase are connected with at least d temporary children whose last digits
constitute the set A and which are present in sub-phase $2^{i-1} - 1$ of the same
phase i. The remaining d temporary children, whose rightmost digits constitute the
set X, belong to sub-phase $2^{i-1}$ itself and may still be unavailable during the
insertion.

So, considering (a) a source not belonging to sub-phase $2^{i-1}$ and a destination
belonging to it, (b) a source belonging to this sub-phase and a destination
outside it, and (c) source and destination both belonging to this
sub-phase, it can easily be concluded from Table 2, the insertion strategy and the
parent-child relationship that there must exist a route between them consisting of
not more than k hops.

Case 3: Insertion of nodes in phase k: 

As in Case 2, it can easily be proved that any source and any destination in such a
situation are always connected through a route no longer than k hops.

2.9 Lemma 6: The degree of a node during insertion of nodes varies from d to 3d.

Proof: This follows from Table 1, Table 2, the properties of the topology and the
lemmas stated above.
3. Routing of the Proposed Interconnection Topology 

A very simple routing strategy exists for the proposed topology to connect any source
to any destination. At any moment of insertion when $d^k < N < (2d)^k$, every node
present in the network is connected either to all available temporary children, or to
2d real children, or to a combination of real and temporary children whose
rightmost digits constitute the set Z. The routing process starts from the source
with the identification of its proper successor towards the destination: among all its
children, it selects the child whose rightmost substring has the longest match with the
leftmost substring of the destination pattern. The selected child becomes
the first intermediate node on the route to the destination. The same process then
continues from the selected child, comparing the rightmost substrings
of its children with the leftmost unused substring of the destination pattern, and
ends when the destination has been reached.

Case Study 

Let us consider the routing from S(012) to D(001). Now, C(012) = 12* = (120, 121,
122, 123).

As seen from the figure, none of the real children of S exist. Hence the temporary
children of S are found as follows: TC(S) = TC(012) = C(010) = 10* = (100, 101,
102, 103).

Two temporary children (100, 101) are present; hence they are compared with the
destination D(001). As the node 100 has the maximum substring match with D,
it is chosen for the next hop.

C(100) = 00* = (000, 001, 002, 003). Thus 001 is a child of 100, and the
route is: 012 -> 100 -> 001.
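The case study can be replayed with a short greedy routine. This is a simplified sketch, not the paper's full link bookkeeping: all names are hypothetical, nodes are digit strings, `network` is simply the set of nodes assumed present, and a node is assumed to fall back on its temporary children only when none of its real children is present.

```python
def real_children(node, d):
    """Shift left by one digit and append any digit of Z = {0..2d-1}."""
    return [node[1:] + str(z) for z in range(2 * d)]


def temporary_children(node, d):
    """Subtract d from the leftmost digit >= d, then take the real children."""
    digits = [int(c) for c in node]
    for pos, value in enumerate(digits):
        if value >= d:
            digits[pos] = value - d
            return real_children("".join(map(str, digits)), d)
    return []


def overlap(child, dest):
    """Length of the longest suffix of child that is a prefix of dest."""
    for length in range(len(dest), 0, -1):
        if child[-length:] == dest[:length]:
            return length
    return 0


def route(src, dest, network, d):
    """Greedy routing: always hop to the present child that best matches dest."""
    path, current = [src], src
    while current != dest:
        candidates = [c for c in real_children(current, d) if c in network]
        if not candidates:  # simplification: temporary links used only as fallback
            candidates = [c for c in temporary_children(current, d) if c in network]
        if not candidates or len(path) > 2 * len(dest):
            raise RuntimeError("no route found")
        current = max(candidates, key=lambda c: overlap(c, dest))
        path.append(current)
    return path


# Assumed network state: the 2^3 graph of Figure 1 plus inserted nodes 002, 003, 012.
network = {"000", "001", "010", "011", "100", "101", "110", "111",
           "002", "003", "012"}
print(" -> ".join(route("012", "001", network, 2)))  # 012 -> 100 -> 001
```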




Figure 3 : The route from 002 to 101 
4. Conclusion 

Amongst all regular topologies for multihop optical networks, the de Bruijn graph is
considered significant because of its simple structure and simple routing scheme,
which support a large number of nodes with small nodal degree
and small diameter. Regular structures like the de Bruijn graph, where the total
number of nodes is defined by some mathematical formula, are however associated with
the problem of non-scalability. Irregular multihop structures, on the contrary,
overcome this problem at the cost of complex routing schemes. This paper aims at
designing an irregular scalable multihop structure based on the de Bruijn graph that
has a simple routing scheme and maintains the same diameter as the de Bruijn graph,
while the perturbation in the network during insertion of nodes is kept at a very
small level.
Abstract. In order to investigate the structure of computable functions 
over (binary) trees, we define two classes of recursive tree functions by 
extending the notion of recursive functions over natural numbers in two 
different ways, and also define the class of functions computable by while- 
programs over trees. Then we show that those classes coincide with the 
class of conjugates of recursive functions over natural numbers via a 
standard coding function (between trees and natural numbers). We also 
study what happens when we change the coding function, and present 
a necessary and sufficient condition for a coding function to satisfy the 
property above mentioned. 



0 Introduction 

We consider in this paper a naive question: What are computable functions over
trees? A simple answer to the question might be the following: a tree function
is computable if and only if it is obtained from a computable function $f$ over
natural numbers as the conjugate $\varphi^{-1} \circ f \circ \varphi$, where $\varphi$
is an appropriate coding function from trees to natural numbers and $\varphi^{-1}$ is
the decoding function.

Then a natural question arises: What coding functions are appropriate? 
Should not they be ‘computable’ in some sense? But then in what sense? Do 
we have to know the notion of computability of coding functions (from trees to 
natural numbers), before we study the notion of computability of tree functions? 

Another question related to the tentative answer is: Does the notion of com- 
putability of tree functions depend on the choice of the coding function or not? 

We try to answer the first question in a number of different ways, and also 
partially answer the other questions mentioned above. 

Our motivation comes from the observation that a number of tree manip- 
ulating operations are found to be useful in well-known algorithms for sorting, 
searching, etc., and also in various application areas such as natural language 
processing. These are concrete examples of interesting ‘computable’ tree opera- 
tions (or partial functions over trees), and certainly there should be many more. 
We would like to develop a structural theory of computable tree operations, 
which hopefully will fill the gap between conventional theory of computation 
and algorithmic aspects of trees. 

In this paper, among various tree structures with or without labels, we con- 
centrate on binary trees without labels. This is because binary trees have the 
important property that any tree structure (and moreover any finite sequence 
of tree structures) can be nicely represented by binary trees without any coding 
function involved. (See, e.g., [4].) Moreover, we believe that studying unlabelled 
binary trees, a degenerate case of labelled binary trees, would be essential for 
studying the general case. 

This paper is organized as follows. In the rest of this section, we summa- 
rize some notations and known facts about computable functions over N = 
{0,1,2,...}, the set of natural numbers. In Section 1, we present basic
definitions about binary trees and their functions, and some of their basic properties.
Then in Section 2 the notion of primitive recursive functions over binary trees is 
introduced, and we show that the class is equal to that of conjugates of conven- 
tional primitive recursive functions (over N) via the standard coding function 
(of binary trees with natural numbers). In Section 3, two classes of recursive tree 
functions are introduced, one of which depends on the coding function and the 
other is coding-free. Then we prove that the two classes are equal to the class 
of conjugates of conventional recursive functions (over N), and also equal to the 
class of functions computable by while-programs over binary trees. In the last 
section, we study how the choice of coding function affects the results; indeed we 
give a necessary and sufficient condition for coding functions to yield the above 
results. 

In this paper, a function from a subset of $N^n$ to $N$ is called a function over
$N$ or a numeric function, and for such a function $f$ we write
$f : N^n \rightharpoonup N$ (using the special symbol $\rightharpoonup$ rather than the
usual $\to$). Note that in the literature these functions are called 'partial'
functions, and we may occasionally use that word to stress the point. As usual, the
notation $f : X \to Y$ means $f$ is a (total) function from $X$ to $Y$.

Recall that the class PR(N) of primitive recursive functions over N is defined
as the least class that contains the successor function $suc : N \to N$, the zero
functions $zero_n : N^n \to N$ and the projection functions $p_{n,i} : N^n \to N$, and
is closed under composition and primitive recursion.

Also recall that the class R(N) of recursive (partial) functions over N is
defined as the least class that contains primitive recursive functions over N and
their minimization functions, and is closed under composition. The minimization
function $\mu f : N^n \rightharpoonup N$ of $f : N^{n+1} \rightharpoonup N$ is defined by

$(\mu f)(x) = z \iff \forall y \le z\, [\, f(x, y) = 0 \iff y = z \,]$.
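As an illustration, minimization can be sketched as an unbounded search. The names `minimize` and `monus` are hypothetical, and the sketch assumes $f$ is total on the values it is asked about; the returned function loops forever when no zero exists, which is how partiality shows up here.

```python
from itertools import count


def minimize(f):
    """Return mu_f, mapping x to the least y with f(x, y) == 0 (diverges if none exists)."""
    def mu_f(*x):
        for y in count():        # y = 0, 1, 2, ...
            if f(*x, y) == 0:
                return y
    return mu_f


def monus(a, b):
    """Truncated subtraction on natural numbers."""
    return max(0, a - b)


# Least y with x - y*y truncated to 0, i.e. the least y such that y*y >= x.
ceil_sqrt = minimize(lambda x, y: monus(x, y * y))
print(ceil_sqrt(10))
```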

The class R(N) is known to be equal to the class of functions computed by 
a (high-level) while-program of the form; 

input($\vec{x}$); $S_1$; $S_2$; ...; $S_k$; output($y$)

where each $S_i$ is a statement, which is either an assignment statement or a
while-statement. An assignment statement is of the form

$x := f(\vec{y})$

and a while-statement is of the form

while $f(\vec{y}) \neq 0$ do $[S'_1; S'_2; \ldots; S'_l]$

where $f$ is an arbitrary primitive recursive function (in PR(N)), $x, y$ are
variables (to store elements of $N$), $\vec{x}$ is a sequence of distinct (input)
variables, and $\vec{y}$ is any sequence of variables. The $S'_j$'s in the
while-statement are statements. Then the (partial) function $f_P : N^n \rightharpoonup N$
computed by a while-program $P$ with $n$ input variables is defined as usual,
assuming that the initial values of all variables other than input variables are
set to 0. For more about the theory of computation and related subjects, see,
e.g., [6].

In the literature, the notions of PR(N) and R(N) have been extended to 
functions over algebraic structures other than N (see, e.g., [1], [2], [7], [8]). The 
present work may be considered as an extension of studies of word functions in
[2] and [7] to the case of functions over binary trees. 



1 Basic Definitions 

1.1 Definition The set $T$ of binary trees (or simply trees) is defined
recursively, as follows.

1. nil $\in T$.

2. If $t_1, t_2 \in T$, then cons($t_1, t_2$) $\in T$.

When $t$ = cons($t_1, t_2$), we write left($t$) = $t_1$ and right($t$) = $t_2$. For each
$t \in T$, we define the size $|t|$ ($\in N$) of $t$ recursively as

$|\mathrm{nil}| = 0$, $|\mathrm{cons}(t_1, t_2)| = |t_1| + |t_2| + 1$.

It is well-known (see e.g. [4]) that for each $n \in N$ the number of binary trees of
size $n$ is equal to the Catalan number $B(n) = (2n)! / ((n+1)! \times n!)$.
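As a quick check of the counting claim, the trees of a given size can be enumerated directly. This is a minimal sketch: representing nil as the empty tuple and cons as a pair is an assumption made here for illustration, as are the helper names `trees_of_size` and `catalan`.

```python
from functools import lru_cache
from math import comb

NIL = ()  # the empty tree nil; cons(l, r) is represented as the pair (l, r)


def cons(left, right):
    return (left, right)


def size(t):
    """|nil| = 0 and |cons(l, r)| = |l| + |r| + 1."""
    return 0 if t == NIL else size(t[0]) + size(t[1]) + 1


@lru_cache(maxsize=None)
def trees_of_size(n):
    """Every binary tree of size n, as a tuple of trees."""
    if n == 0:
        return (NIL,)
    return tuple(
        cons(l, r)
        for k in range(n)                  # k = size of the left subtree
        for l in trees_of_size(k)
        for r in trees_of_size(n - 1 - k)
    )


def catalan(n):
    """B(n) = (2n)! / ((n+1)! n!), computed as C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)
```

Comparing `len(trees_of_size(n))` with `catalan(n)` for small `n` reproduces the count stated above.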

1.2 Definition For $s, t \in T$, we write $s \prec t$ if one of the following
conditions holds:

1. $|s| < |t|$.

2. $|s| = |t|$ and left($s$) $\prec$ left($t$).

3. $|s| = |t|$, left($s$) = left($t$) and right($s$) $\prec$ right($t$).

For example,

nil $\prec$ cons(nil, nil) $\prec$ cons(nil, cons(nil, nil)) $\prec$ cons(cons(nil, nil), nil) $\prec$
cons(nil, cons(nil, cons(nil, nil))) $\prec$ cons(nil, cons(cons(nil, nil), nil)) $\prec$
cons(cons(nil, nil), cons(nil, nil)) $\prec$ cons(cons(nil, cons(nil, nil)), nil) $\prec$
cons(cons(cons(nil, nil), nil), nil) $\prec$ cons(nil, cons(nil, cons(nil, cons(nil, nil)))) $\prec \cdots$

It is not difficult to see that the reflexive closure $\preceq$ of $\prec$ is a total order
on $T$. (See [3].) Moreover the ordered set $(T, \preceq)$ is shown to have the same order
type as $(N, \le)$ through the bijection $\nu(t) = \mathrm{card}\{s \in T \mid s \prec t\}$. (Note that
for each $t \in T$ the set $\{s \in T \mid s \prec t\}$, being a subset of
$\{s \in T \mid |s| \le |t|\}$, has a finite cardinality not exceeding $\sum_{n \le |t|} B(n)$.)
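The order and the rank function $\nu$ can be checked by brute force on small trees. The sketch below assumes the same illustrative representation (nil as the empty tuple, cons as a pair); `precedes` and `nu` are hypothetical helper names, and `nu` counts predecessors directly from the definition rather than using any closed formula.

```python
NIL = ()  # nil; cons(l, r) is the pair (l, r)


def size(t):
    return 0 if t == NIL else size(t[0]) + size(t[1]) + 1


def precedes(s, t):
    """Strict order of Definition 1.2: size first, then left subtree, then right."""
    if size(s) != size(t):
        return size(s) < size(t)
    if s == NIL:                 # equal sizes and s is nil, so t is nil as well
        return False
    if s[0] != t[0]:
        return precedes(s[0], t[0])
    return precedes(s[1], t[1])


def trees_of_size(n):
    if n == 0:
        return [NIL]
    return [(l, r)
            for k in range(n)
            for l in trees_of_size(k)
            for r in trees_of_size(n - 1 - k)]


def nu(t):
    """Number of trees strictly below t; only trees of size <= |t| can qualify."""
    return sum(1
               for n in range(size(t) + 1)
               for s in trees_of_size(n)
               if precedes(s, t))
```

On the trees of the example above, `nu` returns 0, 1, 2, ... in the order listed.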

1.3 Definition Let us write next($t$) for $\min\{s \in T \mid t \prec s\}$, where min refers
to the order $\preceq$. Note that the inverse $\tau : N \to T$ of the bijection $\nu : T \to N$
can be described by using next : $T \to T$ as

$\tau(n) = \nu^{-1}(n) = \mathrm{next}^n(\mathrm{nil})$.



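1.4 Definition For each $n \in N$, let us write $\underline{n}$ ($\overline{n}$, respectively) for the
minimum (maximum) binary tree of size $n$; that is,

$\underline{0}$ = nil, $\underline{n+1}$ = cons(nil, $\underline{n}$),

$\overline{0}$ = nil, $\overline{n+1}$ = cons($\overline{n}$, nil).

We also write $\underline{N} = \{\underline{n} \mid n \in N\}$ and $\overline{N} = \{\overline{n} \mid n \in N\}$.

Using these notations, we can give recursive descriptions of the functions
next : $T \to T$ and $\nu : T \to N$. For the proofs, see [3].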

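1.5 Lemma

- If $t \in \overline{N}$, then next($t$) = $\underline{|t| + 1}$.

- If $t$ = cons($t', t''$) $\notin \overline{N}$ and

  • if $t'' \notin \overline{N}$, then next($t$) = cons($t'$, next($t''$)),

  • if $t'' \in \overline{N}$ and $t' \notin \overline{N}$, then next($t$) = cons(next($t'$), $\underline{|t''|}$),

  • if $t'' \in \overline{N}$ and $t' \in \overline{N}$, then next($t$) = cons($\underline{|t'| + 1}$, $\underline{|t''| - 1}$).

Note that in the last subcase we have $|t''| > 0$ since $t' \in \overline{N}$ and $t \notin \overline{N}$.

1.6 Lemma

- If $t$ = nil, then $\nu(t) = 0$.

- If $t$ = cons($t', t''$), then

$\nu(t) = \sum_{n < |t|} B(n)$

$\quad + \sum_{n < |t'|} (B(n) \times B(|t| - n - 1))$

$\quad + (\nu(t') - \nu(\underline{|t'|})) \times B(|t''|)$

$\quad + (\nu(t'') - \nu(\underline{|t''|}))$.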
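Lemmas 1.5 and 1.6 can be cross-checked against each other: iterating next from nil should produce trees whose rank under the formula of Lemma 1.6 is 0, 1, 2, and so on. The following is a minimal sketch with the same assumed representation (nil as the empty tuple, cons as a pair); `min_tree`, `is_max`, `next_tree`, `first_rank` and `nu` are illustrative names, and `first_rank(n)` uses the fact that the rank of the minimum tree of size $n$ is the number of trees of smaller size.

```python
from math import comb

NIL = ()  # nil; cons(l, r) is the pair (l, r)


def size(t):
    return 0 if t == NIL else size(t[0]) + size(t[1]) + 1


def catalan(n):
    return comb(2 * n, n) // (n + 1)


def min_tree(n):
    """Minimum tree of size n: a right spine of n nodes."""
    t = NIL
    for _ in range(n):
        t = (NIL, t)
    return t


def is_max(t):
    """True iff t is a maximum tree: a left spine whose right subtrees are all nil."""
    while t != NIL:
        if t[1] != NIL:
            return False
        t = t[0]
    return True


def next_tree(t):
    """Successor of t in the order, following the case split of Lemma 1.5."""
    if is_max(t):
        return min_tree(size(t) + 1)
    left, right = t
    if not is_max(right):
        return (left, next_tree(right))
    if not is_max(left):
        return (next_tree(left), min_tree(size(right)))
    return (min_tree(size(left) + 1), min_tree(size(right) - 1))


def first_rank(n):
    """Rank of the minimum tree of size n, i.e. the number of trees of size < n."""
    return sum(catalan(k) for k in range(n))


def nu(t):
    """Rank of t, following the four summands of Lemma 1.6."""
    if t == NIL:
        return 0
    left, right = t
    n_left, n_right = size(left), size(right)
    n_total = n_left + n_right + 1
    smaller = first_rank(n_total)
    shorter_left = sum(catalan(n) * catalan(n_total - n - 1) for n in range(n_left))
    left_rank = nu(left) - first_rank(n_left)
    right_rank = nu(right) - first_rank(n_right)
    return smaller + shorter_left + left_rank * catalan(n_right) + right_rank
```

Starting from `NIL` and applying `next_tree` repeatedly, `nu` of the $k$-th tree reached should equal $k$, which is the statement $\tau(n) = \mathrm{next}^n(\mathrm{nil})$ in executable form.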

The notion of conjugates, which plays an essential role in this paper, can be 
defined, as follows. 

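1.7 Definition Given a bijection $\varphi : X \to Y$ and a function $f : Y^n \rightharpoonup Y$,
where $X$ and $Y$ are arbitrary sets, we define the conjugate $f_\varphi : X^n \rightharpoonup X$ of $f$
via $\varphi$ by

$f_\varphi(x_1, x_2, \ldots, x_n) = \varphi^{-1}(f(\varphi(x_1), \varphi(x_2), \ldots, \varphi(x_n)))$.

For simplicity, one may write $\varphi(\vec{x})$ for the sequence
$(\varphi(x_1), \varphi(x_2), \ldots, \varphi(x_n)) \in Y^n$, and thus abbreviate $f_\varphi$ as
$\varphi^{-1} \circ f \circ \varphi$. Note that the conjugate of $f_\varphi$ via $\varphi^{-1}$ is $f$.

For example, next : $T \to T$ is the conjugate suc$_\nu$ of suc : $N \to N$ via
$\nu : T \to N$, since next($\nu^{-1}(n)$) = next$^{n+1}$(nil) = $\nu^{-1}(n + 1)$ = $\nu^{-1}$(suc($n$)), thus
next = $\nu^{-1} \circ$ suc $\circ\, \nu$ = suc$_\nu$. Also, suc = next$_\tau$, the conjugate of next via
$\tau = \nu^{-1}$.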



2 Primitive Recursive Functions over T 

In this section, we define the notion of primitive recursive functions over T, and 
compare the functions with conjugates of primitive recursive functions over N. 

2.1 Definition The class PR(T) of primitive recursive functions over T is 
defined recursively, as follows. 

1. (constructors) The binary function cons : $T^2 \to T$ and the $n$-ary constant
functions nil$_n$ : $T^n \to T$ such that nil$_n(\vec{t})$ = nil ($n \ge 0$) belong to PR(T).

2. (projections) $p_{n,i} : T^n \to T$ such that $p_{n,i}(t_1, t_2, \ldots, t_n) = t_i$ belong to
PR(T) ($1 \le i \le n$).

3. (composition) PR(T) is closed under composition. That is, if $g : T^l \to T$
and $g_1, g_2, \ldots, g_l : T^n \to T$ belong to PR(T), then so does the function
$f : T^n \to T$ defined by $f(\vec{t}) = g(g_1(\vec{t}), \ldots, g_l(\vec{t}))$.

4. (primitive recursion) If $g : T^n \to T$ and $h : T^{n+4} \to T$ belong to PR(T),
then so does the function $f : T^{n+1} \to T$ defined by

$f(\vec{t}, \mathrm{nil}) = g(\vec{t})$,

$f(\vec{t}, \mathrm{cons}(t', t'')) = h(\vec{t}, t', t'', f(\vec{t}, t'), f(\vec{t}, t''))$.

We will denote the function $f$ in 3 by $g \circ (g_1, \ldots, g_l)$, and the one in 4 by $h * g$.

2.2 Examples The following functions belong to PR(T). 

1. The unary functions left : $T \to T$ and right : $T \to T$ defined by

left(nil) = nil,
left(cons($t', t''$)) = $t'$,

right(nil) = nil,
right(cons($t', t''$)) = $t''$.

2. The minimum tree function mnt : $T \to T$, which assigns the minimum tree of
the same size as the argument (i.e., mnt($t$) = $\underline{|t|}$), can be defined by
primitive recursion:

mnt(nil) = nil,
mnt(cons($t', t''$)) = cons(nil, gr(mnt($t'$), mnt($t''$)))

where gr : $T^2 \to T$ is the function that grafts the first argument at the
rightmost leaf of the second argument:

gr($t$, nil) = $t$, gr($t$, cons($t', t''$)) = cons($t'$, gr($t, t''$)).

Likewise, we can define the maximum tree function mxt : $T \to T$ such that
mxt($t$) = $\overline{|t|}$ by

mxt(nil) = nil,
mxt(cons($t', t''$)) = cons(gl(mxt($t'$), mxt($t''$)), nil)

where

gl($t$, nil) = $t$, gl($t$, cons($t', t''$)) = cons(gl($t, t'$), $t''$).

3. The function nil? : $T \to T$, which tells whether the given tree is nil or
not, can be defined by

nil?(nil) = true,
nil?(cons($t', t''$)) = false.

Here we define true and false by nil and cons(nil, nil), respectively. The
characteristic functions $\underline{N}$? : $T \to T$ ($\overline{N}$? : $T \to T$, respectively) of the sets
$\underline{N} = \{\underline{n} \mid n \in N\}$ ($\overline{N} = \{\overline{n} \mid n \in N\}$) are defined by

$\underline{N}$?(nil) = true,
$\underline{N}$?(cons($t', t''$)) = if(nil?($t'$), $\underline{N}$?($t''$), false),

$\overline{N}$?(nil) = true,
$\overline{N}$?(cons($t', t''$)) = if(nil?($t''$), $\overline{N}$?($t'$), false),

where

if(nil, $t, s$) = $t$,
if(cons($t', t''$), $t, s$) = $s$.

4. The function next : $T \to T$ is primitive recursive, because

next(nil) = cons(nil, nil),
next(cons($t', t''$)) = if($\overline{N}$?(cons($t', t''$)), $u$, $v$)

where

$u$ = cons(nil, mnt(cons($t', t''$))) (= $\underline{|\mathrm{cons}(t', t'')| + 1}$),

$v$ = if($\overline{N}$?($t''$), if($\overline{N}$?($t'$), $w_1$, $w_2$), cons($t'$, next($t''$))),

$w_1$ = cons(cons(nil, mnt($t'$)), right(mnt($t''$))) (= cons($\underline{|t'| + 1}$, $\underline{|t''| - 1}$)),

$w_2$ = cons(next($t'$), mnt($t''$)).
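The primitive recursion scheme of Definition 2.1 and some of the functions of Examples 2.2 can be sketched as a higher-order function. This is only an illustration under assumptions: nil is represented as the empty tuple and cons as a pair, and `prim_rec` is a hypothetical name for the operator written $h * g$ in the text.

```python
NIL = ()  # nil; cons(l, r) is the pair (l, r)


def cons(left, right):
    return (left, right)


def prim_rec(g, h):
    """Build f = h * g, recursing on the last argument:
    f(ts, nil) = g(ts),
    f(ts, cons(a, b)) = h(ts, a, b, f(ts, a), f(ts, b))."""
    def f(*args):
        *ts, t = args
        if t == NIL:
            return g(*ts)
        a, b = t
        return h(*ts, a, b, f(*ts, a), f(*ts, b))
    return f


left = prim_rec(lambda: NIL, lambda a, b, fa, fb: a)
right = prim_rec(lambda: NIL, lambda a, b, fa, fb: b)

# gr(t, s): graft t at the rightmost leaf of s; gl is the mirror image.
gr = prim_rec(lambda t: t, lambda t, a, b, fa, fb: cons(a, fb))
gl = prim_rec(lambda t: t, lambda t, a, b, fa, fb: cons(fa, b))

# mnt / mxt: minimum / maximum tree of the same size as the argument.
mnt = prim_rec(lambda: NIL, lambda a, b, fa, fb: cons(NIL, gr(fa, fb)))
mxt = prim_rec(lambda: NIL, lambda a, b, fa, fb: cons(gl(fa, fb), NIL))
```

For instance, applying `mnt` to the balanced tree of size 3 yields a right spine of three nodes, and `mxt` yields the corresponding left spine.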



In what follows, in studying the relation between conjugates of primitive re- 
cursive numeric functions and primitive recursive tree functions, we find it is 
useful to have a reasonable embedding of the class PR(N) of primitive recursive 
functions over N into the class PR(T) of primitive recursive functions over T. 

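2.3 Lemma For each $n$-ary numeric function $f \in$ PR(N), there exists an
$n$-ary tree function $\underline{f} \in$ PR(T) such that for each $m_1, m_2, \ldots, m_n \in N$

$\underline{f}(\underline{m_1}, \underline{m_2}, \ldots, \underline{m_n}) = \underline{f(m_1, m_2, \ldots, m_n)}$.

Proof We define the tree functions $\underline{f}$ recursively, as follows:

1. $\underline{\mathrm{zero}_n}$ = nil$_n$.

$\underline{\mathrm{suc}}(t)$ = cons(nil, $t$).

$\underline{p_{n,i} : N^n \to N}$ = $p_{n,i} : T^n \to T$.

2. $\underline{g \circ (g_1, \ldots, g_l)} = \underline{g} \circ (\underline{g_1}, \ldots, \underline{g_l})$.

3. $\underline{h * g} = h' * \underline{g}$ where $h'(\vec{t}, t', t'', s', s'') = \underline{h}(\vec{t}, t'', s'')$.

Here the notation $h * g$ on the left-hand side stands for the numeric function
defined by primitive recursion from $g$ and $h$; that is,

$(h * g)(\vec{x}, 0) = g(\vec{x})$, $(h * g)(\vec{x}, x + 1) = h(\vec{x}, x, (h * g)(\vec{x}, x))$.

Then it is easy to see by induction on PR(N) that the functions $\underline{f}$ so defined
satisfy the required property. □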

2.4 Lemma The conjugates cons$_\tau$ : $N^2 \to N$, and left$_\tau$, right$_\tau$ : $N \to N$ are
primitive recursive.

Proof By definition of cons$_\tau$, we have

cons$_\tau(m_1, m_2)$ = $\nu$(cons($\tau(m_1), \tau(m_2)$)) = $\nu$(cons($t_1, t_2$)) = $\nu(t)$

where $t_i = \tau(m_i)$ ($i = 1, 2$) and $t$ = cons($t_1, t_2$). Then by applying Lemma 1.6
we get

cons$_\tau(m_1, m_2)$ = $\sum_{n < f(m_1) + f(m_2) + 1} B(n)$

$\quad + \sum_{n < f(m_1)} (B(n) \times B(f(m_1) + f(m_2) - n))$

$\quad + (m_1 - g(m_1)) \times B(f(m_2))$

$\quad + (m_2 - g(m_2))$

where

$B(m) = (2 \times m)! \div ((m + 1)! \times m!)$,

$f(m) = \mu x \le m\,[m < \sum_{n \le x} B(n)]$ ( $= |\tau(m)|$ ),

$g(m) = \sum_{n < f(m)} B(n)$ ( $= \nu(\underline{|\tau(m)|}) = \nu(\mathrm{mnt}(\tau(m)))$ ).

The functions left$_\tau$ and right$_\tau$ can be expressed as

left$_\tau(m)$ = $\mu x \le m\,[\exists y \le m\,[\mathrm{cons}_\tau(x, y) = m]]$,
right$_\tau(m)$ = $\mu y \le m\,[\exists x \le m\,[\mathrm{cons}_\tau(x, y) = m]]$.

Thus the three functions are primitive recursive. □

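2.5 Lemma For each $t \in T$, we write $\theta(t) = \underline{\nu(t)}$. Then the function
$\theta : T \to T$ satisfies the following:

- $\theta$ is primitive recursive; that is, $\theta \in$ PR(T).

- $\theta$ gives an isomorphism (i.e., order preserving bijection) between $(T, \preceq)$
and $(\underline{N}, \le)$, where $\le$ is the total order in $\underline{N}$ defined by
$\underline{n} \le \underline{m} \iff n \le m$.

- There exists $g : T \to T$ in PR(T) such that

$\forall t \in T, \forall n \in N\,[\theta(t) = \underline{n} \iff t = g(\underline{n})]$.

By abuse of notation, we write $\theta^{-1}$ for the function $g$.

Proof

- $\theta : T \to T$ is primitive recursive, because

$\theta(\mathrm{cons}(t', t'')) = \underline{\nu(\mathrm{cons}(t', t''))}$

$\quad = \underline{\mathrm{cons}_\tau(\nu(t'), \nu(t''))}$

$\quad = \underline{\mathrm{cons}_\tau}(\underline{\nu(t')}, \underline{\nu(t'')})$

$\quad = \underline{\mathrm{cons}_\tau}(\theta(t'), \theta(t''))$.

(For the definition of $\underline{\mathrm{cons}_\tau}$, see 2.3.)

- The function $\theta$, being the composition of two isomorphisms
$\nu : (T, \preceq) \to (N, \le)$ and $\underline{\ \cdot\ } : (N, \le) \to (\underline{N}, \le)$, gives an
isomorphism between $(T, \preceq)$ and $(\underline{N}, \le)$.

- Define $g : T \to T$ by primitive recursion:

$g$(nil) = nil,
$g$(cons($s, t$)) = next($g(t)$).

Then $g(\underline{0})$ = nil, and $g(\underline{n + 1})$ = $g$(cons(nil, $\underline{n}$)) = next($g(\underline{n})$). Therefore
by induction $g(\underline{n})$ = next$^n$(nil) = $\tau(n)$; thus

$t = g(\underline{n}) = \tau(n) \iff \nu(t) = n \iff \theta(t) = \underline{n}$. □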



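2.6 Lemma If $f \in$ PR(N), then $f_\nu = \theta^{-1} \circ \underline{f} \circ \theta$. That is,
$f_\nu(\vec{t}) = \theta^{-1}(\underline{f}(\theta(\vec{t})))$.

Proof By Lemma 2.3, we have

$f_\nu(\vec{t}) = s \iff f(\nu(\vec{t})) = \nu(s)$

$\quad \iff \underline{f}(\underline{\nu(\vec{t})}) = \underline{\nu(s)} \iff \underline{f}(\theta(\vec{t})) = \theta(s)$

$\quad \iff \theta^{-1}(\underline{f}(\theta(\vec{t}))) = s$. □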

2.7 Corollary If $f \in$ PR(N), then $f_\nu \in$ PR(T).

Proof Immediate from Lemmas 2.6, 2.5 and 2.3. □ 

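2.8 Theorem $\{f_\nu \mid f \in \mathrm{PR}(N)\}$ = PR(T).

Proof Since the inclusion $\subseteq$ has been proved, we now verify

$\forall g \in \mathrm{PR}(T), \exists f \in \mathrm{PR}(N)\,[g = f_\nu]$

by induction on PR(T).

1. nil$_n$ = $\nu^{-1} \circ$ zero$_n \circ\, \nu$ = (zero$_n$)$_\nu$.

cons = $\nu^{-1} \circ \nu \circ$ cons $\circ\, \nu^{-1} \circ \nu$ = (cons$_\tau$)$_\nu$ where cons$_\tau \in$ PR(N).

($p_{n,i} : T^n \to T$) = $\nu^{-1} \circ (p_{n,i} : N^n \to N) \circ \nu$ = ($p_{n,i} : N^n \to N$)$_\nu$.

2. If $g = g_0 \circ (g_1, \ldots, g_l)$ and $g_j = (f_j)_\nu$ ($j = 0, \ldots, l$), then
$g = (f_0 \circ (f_1, \ldots, f_l))_\nu$.

3. Let $g = g_1 * g_0$ and $g_j = (f_j)_\nu$ where $f_j \in$ PR(N) ($j = 0, 1$). It suffices to
show that the function $f = g_\tau = \nu \circ g \circ \tau$ belongs to PR(N), since $g = f_\nu$.
From the definition, we have

$f(\vec{m}, 0) = (\nu \circ g \circ \tau)(\vec{m}, 0)$

$\quad = \nu(g(\tau(\vec{m}), \mathrm{nil}))$

$\quad = \nu(g_0(\tau(\vec{m})))$

$\quad = f_0(\vec{m})$,

$f(\vec{m}, m + 1) = (\nu \circ g \circ \tau)(\vec{m}, m + 1)$

$\quad = \nu(g(\tau(\vec{m}), \mathrm{cons}(t', t'')))$, where we write cons($t', t''$) = $\tau(m + 1)$,

$\quad = \nu(g_1(\tau(\vec{m}), t', t'', g(\tau(\vec{m}), t'), g(\tau(\vec{m}), t'')))$

$\quad = f_1(\vec{m}, m', m'', f(\vec{m}, m'), f(\vec{m}, m''))$

where

$m' = \nu(t') = \nu(\mathrm{left}(\tau(m + 1))) = \mathrm{left}_\tau(m + 1)$,
$m'' = \nu(t'') = \nu(\mathrm{right}(\tau(m + 1))) = \mathrm{right}_\tau(m + 1)$.

Since left$_\tau(m + 1)$, right$_\tau(m + 1) < m + 1$, and left$_\tau$, right$_\tau \in$ PR(N), the
description of $f$ implies that $f$ is primitive recursive (cf. [5]). □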



3 Recursive Functions over T 

In this section, we define two classes of recursive functions over T and the class of 
functions computable by while programs over T. Then we study their properties 
in connection with conjugates of recursive functions over N. As in the case of
numeric functions, we call a function from a subset of $T^n$ to $T$ a function over
$T$ or a tree function.

3.1 Definition We define the class R(T) of recursive functions over $T$
recursively, as follows.

1. (primitive recursive functions) PR(T) $\subseteq$ R(T).

2. (minimization) If $f : T^{n+1} \to T$ is primitive recursive, then the (partial)
function $\mu_T f : T^n \rightharpoonup T$ defined by

$(\mu_T f)(\vec{t}) = t \iff \forall s \preceq t\,[f(\vec{t}, s) = \mathrm{nil} \iff s = t]$

belongs to R(T). We call $\mu_T f : T^n \rightharpoonup T$ the $T$-minimization function of $f$
along the values $\tau(0), \tau(1), \tau(2), \ldots$ of $\tau : N \to T$.

3. (composition) R(T) is closed under composition. That is, if partial
functions $g : T^l \rightharpoonup T$ and $g_1, g_2, \ldots, g_l : T^n \rightharpoonup T$ belong to R(T), then so does
the partial function $f : T^n \rightharpoonup T$ defined by

$f(\vec{t}) = s \iff \exists s_1, \ldots, s_l \in T\,[g_1(\vec{t}) = s_1, \ldots, g_l(\vec{t}) = s_l, g(s_1, \ldots, s_l) = s]$.

As before we will write $g \circ (g_1, \ldots, g_l)$ for the function $f$.

One may wonder whether Definition 3.1.2 is the only reasonable way of defining
the minimization function. For example, how about the following as an
alternative?

3.2 Definition For a primitive recursive function $f : T^{n+1} \to T$, define a
partial function $\mu_{\underline{N}} f : T^n \rightharpoonup T$ by

$(\mu_{\underline{N}} f)(\vec{t}) = t \iff \exists m \in N\,[t = \underline{m} \wedge \forall n \le m\,[f(\vec{t}, \underline{n}) = \mathrm{nil} \iff n = m]]$.

We call $\mu_{\underline{N}} f : T^n \rightharpoonup T$ the $\underline{N}$-minimization of $f : T^{n+1} \to T$ along the values
$\underline{0}, \underline{1}, \underline{2}, \ldots$ of the bijection $\underline{\ \cdot\ } : N \to \underline{N}$. We will write $\mathrm{R}_{\underline{N}}(T)$ for the
class of tree functions defined exactly as R(T) except that we replace the
minimization operator $\mu_T$ with $\mu_{\underline{N}}$. The (partial) functions in $\mathrm{R}_{\underline{N}}(T)$ are
called $\underline{N}$-recursive functions.

Next, we define the notion of (high-level) while-programs over T. 

3.3 Definition A while-program over T is of the form; 

input($\vec{x}$); $S_1$; $S_2$; ...; $S_k$; output($y$)

where each $S_i$ is a statement, which is either an assignment statement or a
while-statement. An assignment statement is of the form

$x := f(\vec{y})$

and a while-statement is of the form

while $f(\vec{y}) \neq$ nil do $[S'_1; S'_2; \ldots; S'_l]$.

Here, $f$ is an arbitrary primitive recursive function (in PR(T)), $x, y$ are
variables (to store trees in $T$), $\vec{x}$ is a sequence of distinct (input) variables,
and $\vec{y}$ is any sequence of variables. The $S'_j$'s in the while-statement are (either
assignment or while-) statements.

The (partial) function computed by a while-program $P$ over $T$ is defined as
usual, assuming that the initial values of all variables other than input variables
are set to nil, and it is denoted by $f_P : T^n \rightharpoonup T$ where $n$ is the number of input
variables of the program $P$.

First, we study the relation between three classes: the class of conjugates
of recursive functions over N, the class $\mathrm{R}_{\underline{N}}(T)$ of $\underline{N}$-recursive functions, and
that of functions computable by while-programs over $T$.

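3.4 Lemma We extend the definition of $\underline{f}$ for $f \in$ PR(N) (cf. 2.3) to the
case where $f \in$ R(N), as follows.

- The function $\underline{f}$ for $f \in$ PR(N) is defined as before.

- If $f \in$ PR(N), we define $\underline{\mu f} = \mu_{\underline{N}} \underline{f}$; that is,

$\underline{\mu f}(\vec{t}) = t \iff t = \min\{\underline{m} \in \underline{N} \mid \underline{f}(\vec{t}, \underline{m}) = \mathrm{nil}\}$.

- $\underline{g \circ (g_1, \ldots, g_l)} = \underline{g} \circ (\underline{g_1}, \ldots, \underline{g_l})$.

Then the functions $\underline{f}$ where $f \in$ R(N) satisfy the following.

1. $\underline{f} \in \mathrm{R}_{\underline{N}}(T)$.

2. $\underline{f}(\underline{m_1}, \underline{m_2}, \ldots, \underline{m_n}) = \underline{m} \iff f(m_1, m_2, \ldots, m_n) = m$.

(In particular, $\underline{f}(\underline{m_1}, \ldots, \underline{m_n})$ is defined iff $f(m_1, \ldots, m_n)$ is defined.)

3. $f_\nu = \theta^{-1} \circ \underline{f} \circ \theta$.

Proof By induction on R(N). (For details, see [3].) □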

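3.5 Corollary $\{f_\nu \mid f \in \mathrm{R}(N)\} \subseteq \mathrm{R}_{\underline{N}}(T)$.

Proof The inclusion follows immediately from Lemmas 3.4 and 2.5. □

3.6 Lemma $\mathrm{R}_{\underline{N}}(T) \subseteq \{f_P \mid P$ is a while-program$\}$.

Proof By induction on $\mathrm{R}_{\underline{N}}(T)$. For example, the $\underline{N}$-minimization function
$\mu_{\underline{N}} f$ of $f \in$ PR(T) can be computed by the while-program

input($\vec{x}$);
$y$ := nil;
while $f(\vec{x}, y) \neq$ nil do [$y$ := cons(nil, $y$)];
output($y$) □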



3.7 Lemma $\{f_P \mid P$ is a while-program$\} \subseteq \{f_\nu \mid f \in \mathrm{R}(N)\}$; that is, functions
computable by while-programs over $T$ are conjugates via $\nu$ of recursive functions
over N.

Proof Given a while-program $P$ over $T$, we construct a while-program $Q$ over
N which simulates the computation of $P$ step by step. The program $Q$ is obtained
from $P$ by simply replacing each primitive recursive function $g$ ($\in$ PR(T)) in $P$
with its conjugate $g_\tau = \nu \circ g \circ \tau$ ($\in$ PR(N)) and nil with 0. Thus, for assignment
statements

$x := g(\vec{y})$ becomes $x := g_\tau(\vec{y})$,

and for terminating conditions in while-statements

$g(\vec{y}) \neq$ nil becomes $g_\tau(\vec{y}) \neq 0$.

Under this construction, we can observe the equivalence

$f_P(t_1, \ldots, t_n) = t \iff f_Q(\nu(t_1), \ldots, \nu(t_n)) = \nu(t)$

between the function $f_P$ computed by the while-program $P$ over $T$ and the
function $f_Q$ computed by the while-program $Q$ over N. In other words, we have
$f_P = \nu^{-1} \circ f_Q \circ \nu = (f_Q)_\nu$. This completes the proof, since $f_Q \in$ R(N). □

3.8 Theorem $\{f_\nu \mid f \in \mathrm{R}(N)\} = \mathrm{R}_{\underline{N}}(T) = \{f_P \mid P$ is a while-program$\}$.
Proof Immediate from Corollary 3.5 and Lemmas 3.6 and 3.7. □

Next, we see the relation between the two classes of recursive functions over $T$.

3.9 Theorem R(T) = $\mathrm{R}_{\underline{N}}(T)$.

Proof To see the inclusion $\subseteq$, it suffices to find a while-program to compute the
$T$-minimization function $\mu_T g$ of $g \in$ PR(T), since by definition the class $\mathrm{R}_{\underline{N}}(T)$
includes PR(T) and is closed under composition. But this is easy; the following
while-program over $T$ clearly computes the function $\mu_T g$.

input($\vec{x}$);
$y$ := nil;
while $g(\vec{x}, y) \neq$ nil do [$y$ := next($y$)];
output($y$)

To see $\supseteq$, all we need is to show that for each $g \in$ PR(T) the $\underline{N}$-minimization
function $\mu_{\underline{N}} g$ belongs to R(T). For this purpose, we define a new function

$g'(\vec{t}, t) = g(\vec{t}, t)$ if $t \in \underline{N}$,
$g'(\vec{t}, t) = g(\vec{t}, \mathrm{nil})$ otherwise,

and note the equivalence

$(\mu_{\underline{N}} g)(\vec{t}) = t \iff (\mu_T g')(\vec{t}) = t$.

This means $\mu_{\underline{N}} g = \mu_T g' \in$ R(T), since $g'(\vec{t}, t)$ = if($\underline{N}$?($t$), $g(\vec{t}, t)$, $g(\vec{t}, \mathrm{nil})$), which
belongs to PR(T) by Examples 2.2.3. □
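The two minimization operators differ only in how the search variable is advanced: along all trees via next, or along the minimum trees via cons(nil, y). The sketch below illustrates this and the function $g'$ used in the proof. It rests on assumptions: nil is the empty tuple, cons is a pair, `max_steps` is a hypothetical cut-off added so that a failed search terminates, and all function names are illustrative.

```python
NIL = ()            # nil; cons(l, r) is the pair (l, r)
FALSE = (NIL, NIL)  # any non-nil tree counts as "not nil"


def size(t):
    return 0 if t == NIL else size(t[0]) + size(t[1]) + 1


def min_tree(n):
    t = NIL
    for _ in range(n):
        t = (NIL, t)
    return t


def is_min(t):
    """True iff t is a minimum tree (a right spine)."""
    while t != NIL:
        if t[0] != NIL:
            return False
        t = t[1]
    return True


def is_max(t):
    """True iff t is a maximum tree (a left spine)."""
    while t != NIL:
        if t[1] != NIL:
            return False
        t = t[0]
    return True


def next_tree(t):
    """Successor in the tree order, following Lemma 1.5."""
    if is_max(t):
        return min_tree(size(t) + 1)
    left, right = t
    if not is_max(right):
        return (left, next_tree(right))
    if not is_max(left):
        return (next_tree(left), min_tree(size(right)))
    return (min_tree(size(left) + 1), min_tree(size(right) - 1))


def minimize(g, step, max_steps=1000):
    """while g(xs, y) != nil do y := step(y), with a cut-off for illustration."""
    def run(*xs):
        y = NIL
        for _ in range(max_steps):
            if g(*xs, y) == NIL:
                return y
            y = step(y)
        return None  # no zero found within the step budget
    return run


def mu_T(g, **kw):
    return minimize(g, next_tree, **kw)            # search along all trees


def mu_N(g, **kw):
    return minimize(g, lambda y: (NIL, y), **kw)   # search along minimum trees


def restrict_to_min_trees(g):
    """g'(ts, t) = g(ts, t) if t is a minimum tree, and g(ts, nil) otherwise."""
    def g_prime(*args):
        *ts, t = args
        return g(*ts, t) if is_min(t) else g(*ts, NIL)
    return g_prime
```

On a test function that is nil at a non-minimum tree before it is nil at a minimum tree, `mu_T` and `mu_N` return different results, while `mu_T(restrict_to_min_trees(g))` agrees with `mu_N(g)`, as the equivalence in the proof states.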



It is shown in [3] that a tree function belongs to R(T) = $\mathrm{R}_{\underline{N}}(T)$ =
$\{f_\nu \mid f \in \mathrm{R}(N)\}$ = $\{f_P \mid P$ is a while-program$\}$ if and only if it is representable
in the (type-free) $\lambda$-calculus. Due to space limitations, we omit the details.



4 Choice of Coding Functions 

In this section, we study how the choice of coding function $\nu' : T \to N$ affects the
class $\{f_{\nu'} \mid f \in \mathrm{PR}(N)\}$ of conjugates of primitive recursive numeric functions,
and the class $\{f_{\nu'} \mid f \in \mathrm{R}(N)\}$ of conjugates of recursive numeric functions. The
proof idea in the first half of this section (4.1 - 4.3) is originally due to [2],
Chapter III.

4.1 Lemma Suppose $a : N \to N$ and $b : T \to T$ are bijections. Then

1. If suc$_a$ = $a^{-1} \circ$ suc $\circ\, a \in$ PR(N), then $a^{-1} \in$ PR(N),

2. If cons$_b$ = $b^{-1} \circ$ cons $\circ\, b \in$ PR(T), then $b^{-1} \in$ PR(T).

Proof In the second case, the function $b^{-1}$ can be defined by primitive
recursion as

$b^{-1}(\mathrm{cons}(t', t'')) = (b^{-1} \circ \mathrm{cons} \circ b)(b^{-1}(t'), b^{-1}(t''))$.

The first case is similar. □

4.2 Lemma Suppose $a : N \to N$ and $b : T \to T$ are bijections. Then

1. $a^{-1} \circ \mathrm{PR}(N) \circ a = \mathrm{PR}(N) \iff a, a^{-1} \in \mathrm{PR}(N)$,

2. $b^{-1} \circ \mathrm{PR}(T) \circ b = \mathrm{PR}(T) \iff b, b^{-1} \in \mathrm{PR}(T)$.

Here we write $a^{-1} \circ \mathrm{PR}(N) \circ a$ for the set $\{a^{-1} \circ g \circ a \mid g \in \mathrm{PR}(N)\}$, and similarly
for $b^{-1} \circ \mathrm{PR}(T) \circ b$.

Proof For the direction $\Rightarrow$ of the second case, since the condition implies
cons$_b$ = $b^{-1} \circ$ cons $\circ\, b \in$ PR(T), we know $b^{-1} \in$ PR(T) from Lemma 4.1.2.
Since the condition can be stated as $b \circ \mathrm{PR}(T) \circ b^{-1} = \mathrm{PR}(T)$, we also have
$b \in$ PR(T). For the direction $\Leftarrow$, the assumption $b, b^{-1} \in$ PR(T) implies

$b \circ \mathrm{PR}(T) \circ b^{-1} \subseteq \mathrm{PR}(T)$ and $b^{-1} \circ \mathrm{PR}(T) \circ b \subseteq \mathrm{PR}(T)$,

from which we obtain $\mathrm{PR}(T) \subseteq b^{-1} \circ \mathrm{PR}(T) \circ b \subseteq \mathrm{PR}(T)$. The first case is
similar. □

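Based on these facts, we show a necessary and sufficient condition for a coding
function $\nu' : T \to N$ to satisfy $\{f_{\nu'} \mid f \in \mathrm{PR}(N)\} = (\nu')^{-1} \circ \mathrm{PR}(N) \circ \nu' = \mathrm{PR}(T)$
(cf. Theorem 2.8).

4.3 Theorem For any bijection $\nu' : T \to N$, the following conditions are
equivalent.

1. $(\nu')^{-1} \circ \mathrm{PR}(N) \circ \nu' = \mathrm{PR}(T)$.

2. cons$_{(\nu')^{-1}}$ = $\nu' \circ$ cons $\circ\, (\nu')^{-1} \in$ PR(N), and suc$_{\nu'}$ = $(\nu')^{-1} \circ$ suc $\circ\, \nu' \in$ PR(T).

Proof The direction 1 $\Rightarrow$ 2 is obvious. To see the direction 2 $\Rightarrow$ 1, suppose
$\nu' : T \to N$ satisfies condition 2. Then

$\nu^{-1} \circ \nu' \circ \mathrm{cons} \circ (\nu')^{-1} \circ \nu \in \nu^{-1} \circ \mathrm{PR}(N) \circ \nu = \mathrm{PR}(T)$

because the standard coding function $\nu : T \to N$ satisfies 1. Now, since
$\nu^{-1} \circ \nu' \circ \mathrm{cons} \circ (\nu')^{-1} \circ \nu = ((\nu')^{-1} \circ \nu)^{-1} \circ \mathrm{cons} \circ ((\nu')^{-1} \circ \nu)$, we know
$\nu^{-1} \circ \nu' \in$ PR(T) from Lemma 4.1.2. Similarly, since

$\nu \circ (\nu')^{-1} \circ \mathrm{suc} \circ \nu' \circ \nu^{-1} \in \nu \circ \mathrm{PR}(T) \circ \nu^{-1} = \mathrm{PR}(N)$,

we know $\nu \circ (\nu')^{-1} \in$ PR(N) from Lemma 4.1.1. This implies

$(\nu^{-1} \circ \nu')^{-1} = (\nu')^{-1} \circ \nu$

$\quad = \nu^{-1} \circ (\nu \circ (\nu')^{-1}) \circ \nu$

$\quad \in \nu^{-1} \circ \mathrm{PR}(N) \circ \nu = \mathrm{PR}(T)$.

Thus we have $(\nu^{-1} \circ \nu'), (\nu^{-1} \circ \nu')^{-1} \in$ PR(T). Hence

$\mathrm{PR}(T) = (\nu^{-1} \circ \nu')^{-1} \circ \mathrm{PR}(T) \circ (\nu^{-1} \circ \nu')$ by Lemma 4.2.2

$\quad = (\nu')^{-1} \circ (\nu \circ \mathrm{PR}(T) \circ \nu^{-1}) \circ \nu'$

$\quad = (\nu')^{-1} \circ \mathrm{PR}(N) \circ \nu'$ by Theorem 2.8.

This completes the proof. □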

Next we show that under the condition of Theorem 4.3, the choice of coding
function does not affect the notion of recursive functions over $T$. More precisely,
$\{f_{\nu'} \mid f \in \mathrm{PR}(N)\} = \mathrm{PR}(T)$ implies $\{f_{\nu'} \mid f \in \mathrm{R}(N)\} = \mathrm{R}'(T) = \mathrm{R}(T)$. Here
R'(T) is defined as R(T) except that the minimization operator $\mu_T$ in the
definition of R(T) is now replaced with $\mu'_T$, which is defined by

$(\mu'_T f)(\vec{t}) = t \iff \forall s \preceq' t\,[f(\vec{t}, s) = \mathrm{nil} \iff s = t]$

where $\preceq'$ is the total order in $T$ induced by $\nu'$; i.e.,

$s \preceq' t \iff \nu'(s) \le \nu'(t)$.

First we note that the proofs in previous sections (in particular, those for
Lemmas 2.5, 3.4, 3.7 and Theorem 3.8) can be carried out as before even if we
replace $\nu$ with any bijection $\nu'$ satisfying the condition of Theorem 4.3. Thus for
example we obtain

4.4 Lemma Under the condition of Theorem 4.3, the function $\theta' : T \to T$
defined by $\theta'(t) = \underline{\nu'(t)}$ satisfies the following:

- $\theta' \in$ PR(T).

- $\theta'$ gives an isomorphism between $(T, \preceq')$ and $(\underline{N}, \le)$.

- There exists a function $g : T \to T$ in PR(T) such that

$\forall t \in T, \forall n \in N\,[\theta'(t) = \underline{n} \iff t = g(\underline{n})]$.

4.5 Theorem Under the condition of Theorem 4.3,

$\{f_{\nu'} \mid f \in \mathrm{R}(N)\} = \mathrm{R}_{\underline{N}}(T) = \{f_P \mid P$ is a while-program$\}$.

Based on these facts, we can prove the following counterpart of Theorem 3.9
by slightly modifying the previous reasoning.

4.6 Theorem Under the condition of Theorem 4.3, R'(T) = $\mathrm{R}_{\underline{N}}(T)$.

Proof The inclusion $\subseteq$ can be verified as before except that the while-program
to compute $\mu_T g$ is now replaced with the following while-program to compute
$\mu'_T g$:

input($\vec{x}$);
$y$ := nil';
while $g(\vec{x}, y) \neq$ nil do [$y$ := next'($y$)];
output($y$)

Here we define nil' = $\tau'(0)$ and next' = suc$_{\nu'}$ = $\tau' \circ$ suc $\circ\, \nu'$ where $\tau' = (\nu')^{-1}$.

To see the converse, as before all we need is to show that for any $g : T^{n+1} \to T$
in PR(T) the minimization function $\mu_{\underline{N}} g : T^n \rightharpoonup T$ belongs to R'(T). For this
purpose, define $g'$ by $g'(\vec{t}, t) = g(\vec{t}, \theta'(t))$, so that $g'(\vec{t}, \tau'(n)) = g(\vec{t}, \underline{n})$. Then

$(\mu_{\underline{N}} g)(\vec{t}) = \underline{s} \iff s = \min\{n \mid g(\vec{t}, \underline{n}) = \mathrm{nil}\}$

$\quad \iff s = \nu'(\min\{s' \in (T, \preceq') \mid g'(\vec{t}, s') = \mathrm{nil}\})$

$\quad \iff \underline{s} = \theta'((\mu'_T g')(\vec{t}))$.

Thus we have $\mu_{\underline{N}} g = \theta' \circ (\mu'_T g') \in$ R'(T) since $\theta', g' \in$ PR(T). □

4.7 Corollary For a bijection $\nu' : T \to N$ and its inverse $\tau' : N \to T$, the
following conditions are equivalent.

1. cons$_{\tau'} \in$ PR(N) and suc$_{\nu'} \in$ PR(T).

2. $\{f_{\nu'} \mid f \in \mathrm{PR}(N)\} = \mathrm{PR}(T)$.

3. $\{f_{\nu'} \mid f \in \mathrm{PR}(N)\} = \mathrm{PR}(T)$, and
$\{f_{\nu'} \mid f \in \mathrm{R}(N)\} = \mathrm{R}'(T) = \mathrm{R}(T) = \mathrm{R}_{\underline{N}}(T) = \{f_P \mid P$ is a while-program$\}$.

Proof Immediate from Theorems 4.3, 4.5 and 4.6. □

1 Motivation and results

Interval temporal logics (ITLs) were introduced in the philosophy of
time (see [Ben95] for a survey) but have proved useful in artificial
intelligence and computer science [All83, HMM83, HS91, ZHR91].
They provide a rich specification language for systems working with
dense time (for example, [RRM93]). By now, there is a whole menagerie
of ITLs. In this paper, we work with the simplest (propositional)
ITLs and discuss their decidability.

Let $T$ be a set linearly ordered by $\ge$. An interval $[a, b]$ is defined
as usual for $a, b \in T$, $b \ge a$.

Formulas of the basic propositional Interval Temporal Logic C
build on a countable collection of propositions by closing under the
logical connectives $\neg$, $\vee$ and $\frown$ (“chop”). The other Boolean
connectives are defined as usual. We identify a special proposition $\ell_0$
which we will use to mark intervals $[a, a]$ consisting of one point [Ven91].
To be more precise, the logic can be called $C_0$ [LR00] but in this
paper we use the simpler notation.

Given a proposition $p$, a valuation $V$ assigns intervals on which
it is true, subject to the condition that the special proposition $\ell_0$ is
made true exactly on point intervals of the form $[a, a]$. A model is a
pair $M = ((T, \ge), V)$.

A formula is assigned truth value inductively, the only nontrivial
case being for the “chop” operator:

- $M, [a, b] \models p$ iff $[a, b] \in V(p)$,

J. He and M. Sato (Eds.): ASIAN 2000, LNCS 1961, pp. 290-298, 2000.

© Springer-Verlag Berlin Heidelberg 2000

- $M, [a, b] \models \varphi \vee \psi$ iff $M, [a, b] \models \varphi$ or $M, [a, b] \models \psi$,

- $M, [a, b] \models \varphi \frown \psi$ iff $M, [a, m] \models \varphi$ and $M, [m, b] \models \psi$, for some
$m$ such that $b \ge m \ge a$.

An example formula is $\neg\ell_0 \frown \varphi \frown \neg\ell_0$, which we denote <D>$\varphi$. It
says that $\varphi$ holds somewhere within an interval. Its dual [D]$\varphi$ is
$\neg$<D>$\neg\varphi$. Similarly <B>$\varphi$ and <E>$\varphi$ are defined to be $\varphi \frown \neg\ell_0$ and $\neg\ell_0 \frown \varphi$
respectively, and they specify assertions holding in a “beginning” or
“ending” interval. As an aside, note that the proposition $\ell_0$ can be
defined as [B] false.
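The truth definition and the derived modalities can be tried out on a small model. The following is a minimal sketch under strong assumptions: time is the finite discrete order 0, 1, ..., so it says nothing about the dense or infinite structures studied in this paper; formulas are encoded as nested tuples; and the names `holds`, `PT` (standing for the point-interval proposition), `D`, `B`, `E` are illustrative choices.

```python
PT = ("pt",)  # the special proposition, true exactly on point intervals


def prop(name):
    return ("prop", name)


def neg(f):
    return ("not", f)


def lor(f, g):
    return ("or", f, g)


def chop(f, g):
    return ("chop", f, g)


def holds(formula, a, b, valuation):
    """Truth of a formula on the interval [a, b], with a <= b integers."""
    kind = formula[0]
    if kind == "pt":
        return a == b
    if kind == "prop":
        return (a, b) in valuation.get(formula[1], set())
    if kind == "not":
        return not holds(formula[1], a, b, valuation)
    if kind == "or":
        return holds(formula[1], a, b, valuation) or holds(formula[2], a, b, valuation)
    if kind == "chop":
        # some chopping point m with a <= m <= b
        return any(holds(formula[1], a, m, valuation) and holds(formula[2], m, b, valuation)
                   for m in range(a, b + 1))
    raise ValueError(f"unknown connective: {kind}")


def D(f):
    return chop(neg(PT), chop(f, neg(PT)))   # f holds on a strictly inner subinterval


def B(f):
    return chop(f, neg(PT))                  # f holds on a proper beginning interval


def E(f):
    return chop(neg(PT), f)                  # f holds on a proper ending interval


TRUE = lor(PT, neg(PT))
```

With a valuation that makes `p` true only on `(2, 3)`, `holds(D(prop("p")), 0, 5, v)` is true while `holds(D(prop("p")), 2, 5, v)` is false, and `neg(B(TRUE))` holds exactly on point intervals, matching the remark that the special proposition is definable as [B] false.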

As an example of a formula not definable in C, we mention <A>$\varphi$,
which holds in an interval $[a, b]$ if $\varphi$ holds in a “neighbouring” interval
$[b, c]$ for some $c$.

These formulas come from a pioneering paper by Halpern and
Shoham [HS91] which studied decidability questions of ITLs. This
paper sharpens their results.

Instead of the binary “chop” modality, Halpern and Shoham consider
the logic HS with the three unary modalities <B>, <E> and <A>.
Observe that <D>$\varphi$ is definable in HS as <B><E>$\varphi$. Venema’s paper
[Ven90] shows that chop is not definable in HS. The logic having the
chop modality as well as the three unary modalities is called CDT
[Ven91].

We will also be interested in two sublogics BE and DD of HS. BE
only has the modalities <B> and <E> and is also a sublogic of C. DD
has the two modalities <D> and $\langle\overline{D}\rangle$, where $\langle\overline{D}\rangle\varphi$ holds if $\varphi$ is
true in an interval of which the current interval is a subinterval (a
kind of “mirror image” of <D>). We also define D to have the single
modality <D>.

Halpern and Shoham [HS91] showed that the logic HS (and there- 
fore CDT) is undecidable on many reasonable kinds of ordered struc- 
tures (including the natural numbers and the reals), and raised the 
question of decidability of BE and DD. The decidability of the logic 
C has also remained open. 

In this paper, we show that BE (and therefore C) is undecid- 
able. Specifically, validity is r.e.-hard. We expect that DD can also 
be shown undecidable by a refinement of the Halpern-Shoham ar- 
gument, similar to the one below. On the other hand, the logic D 
should be decidable. 

We do not claim a great deal of originality for our proof, which 
closely follows Halpern and Shoham [HS91]. They reduce the non- 
halting problem of a Turing machine on a blank tape to the satisfia- 
bility problem of HS. Looking at their proof, we see that most of it is 
carried out in the BE sublogic. Our aim is to convert the remaining 
part to BE. 

Our proof shows that satisfiability is undecidable over the ordinal
$\omega + 1$ (equivalently, over the natural numbers when intervals of the
form $[a, \infty)$ are allowed). This is in contrast to monadic second-order
logic MSO[<], which Büchi showed to be decidable (in fact, over all
countable ordinals). Our results show that a translation of formulas
from the logic into MSO à la [Pan95, Rab98] will not work for the
entire class of valuations.

The logic HS was shown to have a complete axiomatization by
Venema [Ven90]. In [Ven91], Venema showed that CDT is also complete.
In a forthcoming paper, we report on the axiomatization of
BE and C [LR00].

Related work

Many papers provide decidability results for ITL and Duration Cal- 
culus (or DC, an extension of ITL). The reason they are able to do 
this is because they work with models where the valuation function is 
restricted from being so wild as to assign an arbitrary set of intervals 
for a proposition. 

For instance, much of the DC literature [ZHR91, ZHS93] inter- 
prets a proposition’s being true in an interval as holding almost ev- 
erywhere [Bur82], that is, “most” subintervals of this interval must 
also make it true. We call models with such valuations ae-models. 
Zhou, Hansen and Sestoft encoded a proposition as a string of formu- 
las and then used regular language techniques to prove decidability 
of DC on ae-models [ZHS93]. 

Pandya [Pan95] works with an interesting variation where the 
valuation for propositions only needs to be provided for point inter- 
vals. We call these pi-models. Pandya shows decidability of DC (and 
extensions of it) on pi-models [Pan95] by embedding the logic into 
the monadic second-order logic of order MSO[<]. 

Another class considered is that of finitely varying models, where
a proposition can only change value finitely many times during any
finite interval [ZHR91]. We call these fv-models. (For instance, all
models over a discrete time line fall into this class.) Rabinovich shows
decidability on fv-models [Rab98], again by translation into MSO[<].

Like Halpern and Shoham, we are working on models where the
valuation is unrestricted, hence none of these decidability results
apply.

2 The undecidability proof

Idea

Halpern and Shoham represent a Turing machine computation as
an infinite sequence of IDs (instantaneous descriptions, which are
sometimes also called configurations of the machine). In specifying
this sequence, they use the <A> modality.

Each ID is a finite sequence of tape cells containing a unique tape
symbol, and one of the cells has additional information representing
the head position and state of the machine.

The most ingenious part of Halpern and Shoham’s construction
lies in using a proposition corr (“corresponds”) which makes it possible
to talk about consecutive IDs. For instance, corr would be true
of an interval which begins with the 1936th tape cell of an ID and
ends with the 1936th tape cell of the next ID. Again, this requires
using an <A> modality.

Once this is done, the transition function $\delta$ of the Turing machine
can be respected by examining a group of three cells in an ID and
determining the value of the same three cells in the next ID.

Our aim is to do the coding within BE, i.e. to eliminate the use
of the <A> modality. We achieve this by treating the entire infinite
computation as being inside a dense interval $[a, b]$. Now the <D>
modality can be used to talk about consecutive IDs, sequences, etc.



Details 

Onr claim is that the formnla computation A comp-properties below, 
parameterized by a Tnring machine, is satisfiable if and only if it does 
not halt on a blank tape. We closely follow the proof in [HS91, 8.2]. 
In particnlar, analognes of Lemmas 8.9, 8.12 and 8.14 of that paper 
hold for onr rednction. 

We assnme the TM only writes the symbols 0 and 1. Let Q 
be its set of states. The propositions of onr langnage inclnde L = 
{0, 1, (g, 0), (g, 1), (g, B) \ q E Q}. There are also other propositions 
like cell, ID etc, which we describe below. 
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The formula comp-properties enforces properties that an interval 
[a, 6] where the proposition computation holds must satisfy. Assume 
such an [a, b] \= computation. 

comp-properties = computation D 

not-contains(computation) A 
seq(ID) A seq(ID)-properties A 
<B>init-ID A ID-properties A 
cell-properties A seq(cell)-properties A 
corr-properties A obeys-5 

First, we have a generic specihcation that an interval does not 
contain a subinterval satisfying a particular proposition. For the for- 
mula above, this will show that no proper subinterval of [a, b] will 
have computation true. 

not-contains(x) = A -i<D>a; A -i<E>x 

Now we encode generic sequences using the chop operator. Hence 
we are outside BE, but we will show how to eliminate the chops later. 

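\[
\begin{array}{rcl}
\textit{seq}(x)\textit{-properties} & = & [D]\,(\textit{seq}(x) \supset x \frown \textit{seq}(x)) \land {} \\
 & & [D]\,(x \supset \neg\, \textit{seq}(x))
\end{array}
\]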
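The chop operator used here can be sketched on a finite integer model. The code below is only a finite analogue, since the sequences in the proof are infinite; it assumes that chop splits $[a, b]$ at a shared point $c$, and the names `chop` and `is_finite_seq` are hypothetical.

```python
def chop(phi, psi, a, b):
    """[a, b] satisfies phi chop psi iff some point c splits it into
    [a, c] satisfying phi and [c, b] satisfying psi."""
    return any(phi(a, c) and psi(c, b) for c in range(a, b + 1))

def is_finite_seq(x, a, b):
    """Finite analogue of x chop x chop ... chop x over [a, b]:
    [a, b] is a single x-interval, or an x-interval followed by a
    shorter interval that is again such a sequence."""
    if x(a, b):
        return True
    # The first piece must end strictly inside (a, b) so the recursion terminates.
    return any(x(a, c) and is_finite_seq(x, c, b) for c in range(a + 1, b))

# Hypothetical valuation: x holds exactly on intervals of length 2.
x = lambda a, b: b - a == 2
print(chop(x, x, 0, 4))        # True: split at 2
print(is_finite_seq(x, 0, 6))  # True: [0, 2], [2, 4], [4, 6]
print(is_finite_seq(x, 0, 5))  # False: an odd length cannot be tiled
```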

comp-properties now shows that $[a, b] \models \textit{ID} \frown \textit{seq}(\textit{ID})$, and by 
repeated application, $[a, b] \models \textit{ID} \frown \textit{ID} \frown \ldots$ (that is, an infinite 
sequence of IDs). The same trick works for IDs, showing each ID to be an 
infinite sequence of cells, additionally requiring that it contains a 
unique cell containing a state. 

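\[
\begin{array}{rcl}
\textit{ID-properties} & = & [D]\,(\textit{ID} \supset \textit{not-contains}(\textit{ID}) \land \textit{seq}(\textit{cell}) \land \textit{one-state}) \\
\textit{one-state} & = & \langle D \rangle \textit{state} \land \neg \langle D \rangle (\langle B \rangle \textit{state} \land \langle E \rangle \textit{state})
\end{array}
\]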

Some special IDs are abbreviated: 

\[
\textit{init-ID} = \textit{ID} \land [D]\,((\textit{cell} \supset \textit{blank}) \land (\textit{state} \supset \textit{init-state}))
\]

The properties of a cell are described below. We also abbreviate 
some special cells. 

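\[
\begin{array}{rcl}
\textit{cell-properties} & = & [D]\,(\textit{cell} \supset \textit{not-contains}(\textit{cell}) \land \textit{unique-val}) \\
\textit{unique-val} & = & \bigvee_{l \in L} l \;\land \bigwedge_{l \neq m} (l \supset \neg m) \\
\textit{cell}(l) & = & \textit{cell} \land l \\
\textit{state} & = & \bigvee_{l = (q, i) \in L} \textit{cell}(l)
\end{array}
\]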







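\[
\begin{array}{rcl}
\textit{init-state} & = & \bigvee_{l = (q_0, i) \in L} \textit{cell}(l) \\
\textit{blank} & = & \textit{cell}(B) \lor \bigvee_{l = (q, B) \in L} \textit{cell}(l)
\end{array}
\]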

Now we come to Halpern and Shoham's ingenious corr proposition. 
It is true of an interval if and only if it starts and ends with a 
cell, and these cells are corresponding cells in consecutive IDs. 

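\[
\begin{array}{rcl}
\textit{corr-properties} & = & [D]\,(\textit{cell-rule} \land \textit{ID-rule} \land \textit{corr-starts} \land \textit{corr-ends} \\
 & & \quad {} \land (\textit{corr} \supset \textit{not-contains}(\textit{corr}))) \\
\textit{cell-rule} & = & \textit{corr} \supset (\langle B \rangle \textit{cell} \land \langle E \rangle \textit{cell}) \\
\textit{corr-starts} & = & (\langle B \rangle \textit{cell} \land \langle E \rangle \textit{corr}) \supset \langle B \rangle \textit{corr} \\
\textit{corr-ends} & = & (\langle E \rangle \textit{cell} \land \langle B \rangle \textit{corr}) \supset \langle E \rangle \textit{corr} \\
\textit{ID-rule} & = & \textit{ID} \frown \textit{cell} \supset \textit{corr}
\end{array}
\]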

The last two formulas above are simpler than in [HS91] because 
we are only concerned with an infinite computation. 

Finally, the transition function is respected by examining a group 
of three cells and determining the value of the middle state in the 
next ID. Here is our formulation of Halpern and Shoham's formula: 

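\[
\begin{array}{rcl}
\textit{3-cell}(x, y, z) & = & \textit{cell}(x) \frown \textit{cell}(y) \frown \textit{cell}(z) \\
\textit{obeys-}\delta & = & \bigwedge_{i, j, k \in L} [D]\,(\langle B \rangle \textit{corr} \land \langle B \rangle \textit{3-cell}(i, j, k) \supset {} \\
 & & \quad \langle B \rangle (\textit{corr} \frown \textit{cell}(\delta(i, j, k))))
\end{array}
\]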
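The three-argument use of $\delta$ above maps a window of three consecutive cells to the content of the middle cell in the next ID. A minimal sketch of such a window function follows; it assumes a hypothetical encoding in which a transition table `trans` maps a pair (state, symbol) to a triple (new state, written symbol, move), with move either "L" or "R", and in which a cell under the head is a tuple. None of these names come from [HS91].

```python
def delta3(trans, i, j, k):
    """Content of the middle cell in the next ID, given the three
    consecutive cells i, j, k of the current ID.

    A plain cell is a symbol string; a cell under the head is a tuple
    (state, symbol). A missing entry in `trans` (a halting
    configuration) raises KeyError in this sketch."""
    if isinstance(j, tuple):      # head is on the middle cell
        _, written, _ = trans[j]
        return written            # the head writes and then moves away
    if isinstance(i, tuple):      # head is on the left neighbour
        new_state, _, move = trans[i]
        if move == "R":
            return (new_state, j)  # head arrives on the middle cell
    if isinstance(k, tuple):      # head is on the right neighbour
        new_state, _, move = trans[k]
        if move == "L":
            return (new_state, j)
    return j                      # untouched cells keep their content

# Hypothetical one-rule machine: in q0 on a blank, write 1, move right.
trans = {("q0", "B"): ("q1", "1", "R")}
print(delta3(trans, "B", ("q0", "B"), "B"))  # 1
print(delta3(trans, ("q0", "B"), "B", "B"))  # ('q1', 'B')
print(delta3(trans, "B", "B", ("q0", "B")))  # B
```

Because the next content of a cell depends only on the cell and its two neighbours, a formula that quantifies over all windows $(i, j, k)$ is enough to enforce one step of the machine.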

As stated earlier, the formula $\textit{computation} \land \textit{comp-properties}$ will 
encode a Turing machine. 



Undecidability of BE 

Now we have undecidability of C, but to prove that of BE we have to 
replace occurrences of chop by operators definable within BE. First 
of all, we can rewrite our sequencing operation: 

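\[
\begin{array}{rcl}
\textit{seq}(x)\textit{-properties} & = & [D]\,(\textit{seq}(x) \supset \langle B \rangle x \land [E]\,(x \supset \textit{seq}(x))) \land {} \\
 & & [D]\,(x \supset \neg\, \textit{seq}(x))
\end{array}
\]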

The 3-cell formula was already written by Halpern and Shoham 
in BE, as also a 2-cell formula which we will require later: 
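\[
\begin{array}{rcl}
\textit{3-cell}(x, y, z) & = & \langle B \rangle \textit{cell}(x) \land \langle D \rangle \textit{cell}(y) \land \langle E \rangle \textit{cell}(z) \\
 & & {} \land [D]\,(\textit{cell} \supset \textit{cell}(y)) \\
\textit{3-cell} & = & \bigvee_{x, y, z} \textit{3-cell}(x, y, z) \\
\textit{2-cell} & = & \langle B \rangle \textit{cell} \land \langle E \rangle \textit{cell} \land [D]\, \neg\, \textit{cell}
\end{array}
\]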
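The chop-free 2-cell and 3-cell formulas can be checked on a finite integer model. The sketch below is only an illustration: it assumes that consecutive cells share an endpoint, reads the modalities strictly (proper prefixes, proper suffixes, strictly interior subintervals), and uses a hypothetical dictionary `cells` mapping an interval to the value of the cell occupying it.

```python
def prefixes(a, b):
    """Proper prefixes [a, c], a <= c < b (the <B> modality)."""
    return [(a, c) for c in range(a, b)]

def suffixes(a, b):
    """Proper suffixes [c, b], a < c <= b (the <E> modality)."""
    return [(c, b) for c in range(a + 1, b + 1)]

def interiors(a, b):
    """Strictly interior subintervals [c, d], a < c <= d < b (the <D> modality)."""
    return [(c, d) for c in range(a + 1, b) for d in range(c, b)]

def two_cell(cells, a, b):
    """<B>cell and <E>cell and [D] not cell."""
    return (any(p in cells for p in prefixes(a, b))
            and any(s in cells for s in suffixes(a, b))
            and not any(d in cells for d in interiors(a, b)))

def three_cell(cells, a, b, x, y, z):
    """<B>cell(x) and <D>cell(y) and <E>cell(z) and [D](cell implies cell(y))."""
    return (any(cells.get(p) == x for p in prefixes(a, b))
            and any(cells.get(d) == y for d in interiors(a, b))
            and any(cells.get(s) == z for s in suffixes(a, b))
            and all(cells[d] == y for d in interiors(a, b) if d in cells))

# Hypothetical tape fragment: three consecutive cells with values 0, 1, 0.
cells = {(0, 2): "0", (2, 4): "1", (4, 6): "0"}
print(two_cell(cells, 0, 4))                   # True
print(two_cell(cells, 0, 6))                   # False: [2, 4] is an interior cell
print(three_cell(cells, 0, 6, "0", "1", "0"))  # True
```

The final conjunct of 3-cell is what rules out longer runs: every cell strictly inside the interval must be the middle cell, so exactly three cells are present.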

We modify the interpretation of corr slightly: it holds of an interval if 
and only if the interval starts and ends with a cell, and the ending cell 
is the successor of the cell corresponding to the starting cell in the 
next ID. Now we can define the transition relation within BE. Our trick 
requires a modification of ID-rule as well: 

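\[
\begin{array}{rcl}
\textit{ID-rule} & = & (\langle B \rangle \textit{ID} \land \langle E \rangle \textit{2-cell} \land [E]\, \neg\, \textit{3-cell}) \supset \textit{corr} \\
\textit{obeys-}\delta & = & \bigwedge_{i, j, k \in L} [D]\,(\langle B \rangle \textit{corr} \land \langle B \rangle \textit{3-cell}(i, j, k) \supset {} \\
 & & \quad \langle E \rangle (\textit{cell}(\delta(i, j, k))))
\end{array}
\]
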

Undecidability over $\omega + 1$ 

We have used a simpler argument than Halpern and Shoham's, 
representing each ID as an interval containing an infinite sequence of 
tape cells, and the computation as an infinite sequence of IDs. So 
our proof shows that satisfiability is undecidable over the ordinal $\omega^2$. 

On the other hand, Halpern and Shoham use markers of two 
kinds: cell-markers separating consecutive cells and ID-markers 
separating consecutive IDs. This means that the entire computation 
can be represented as a single infinite sequence with each cell/ID 
delimited by the appropriate kind of marker. 

It is not difficult to add this level of detail to our proof. 
Notice that our initial interval $[a, b]$, which satisfies the proposition 
computation, has to have inside it the entire computation, hence the 
point $b$ must be after the computation. Hence we obtain 
undecidability of BE over $\omega + 1$. The same effect may be achieved if we work 
over the natural numbers by allowing intervals of the form $[a, \infty)$. 